A. DIRECTOR'S OVERVIEW


This  document  is  the  second  year  annual  report  of  the  Cornell  Joint 
Services  Electronics  Program  for  the  period  from  May  1, 1990  to  April  30,  1993. 
The  present  Cornell  program  carries  two  themes:  femtosecond  carrier 
processes  in  compound  semiconductors,  and  real  time  signal  processing.  The 
program has advanced according to plan. Eight task investigators, Profs.
R. Shealy, C. Tang, C. Pollock, P. Krusius, G. Bilardi, F. Luk, A. Bojanczyk, and
H. Torng, with their graduate students have contributed to JSEP research this
year.  Early  during  this  year  G.  Bilardi  left  Cornell  University  in  order  to  take 
a  position  at  the  Italian  University,  Universita'  di  Padova,  Dipartimento  di 
Electtrotechnica  ed  Informatica.  A  substitute  task  proposed  by  Prof.  Adam 
Bojanczyk  was  approved  starting  September  30,  1991.  The  integration  of 
research  under  each  of  the  two  themes  has  progressed  according  to  plan  and 
joint  publications  are  appearing.  Eleven  graduate  students  have  been 
partially,  or  fully,  supported  by  JSEP  this  year.  A  total  of  29  publications  and 
eight theses were prepared in this period and are now in various stages of
processing.  Eight  PhD  degrees  have  been  awarded  to  JSEP  supported  students 
during  this  reporting  period. 

B.  DESCRIPTION  OF  SPECIAL  ACCOMPLISHMENTS  AND  TECHNOLOGY 
TRANSITION 

B.1 Femtosecond Carrier Processes in Compound Semiconductors

Several  significant  achievements  have  been  reached  in  the  research 
performed  under  the  compound  semiconductor  theme.  The  new  off-campus 
organometallic  vapor  phase  epitaxial  (OMVPE)  compound  semiconductor 
materials  growth  facility  has  been  completed  and  outstanding  OMVPE  films 
have  been  grown  under  the  leadership  of  R.  Shealy.  The  facility  has  a  total 
area  of  5,000  sq  ft,  with  1,800  sq  ft  of  class  10,000  clean  room,  and  will  house  3 
OMVPE  reactors,  the  first  of  which  became  operational  during  this  reporting 
period.  The  facility  design  sets  new  standards  in  New  York  State  for  handling 
highly  toxic  hydride  process  gases  by  exceeding  even  the  stringent  code  set  by 
the State of California. Undoped GaAs films have been grown in this
multichamber OMVPE reactor using triethylgallium (TEG) and arsine source gases
at very low V/III flow ratios. Films grown above 600 °C are n-type and
have carrier concentrations typically less than 5 x 10^14 cm^-3 with 77 K mobilities
exceeding 101,000 cm^2/V-s. The second OMVPE reactor is being readied for
deep  UV  stimulated  selective  OMVPE  growth  for  exciting  new  structures. 
Very high average powers of high repetition rate femtosecond pulses in the
blue have for the first time been generated by C. Tang's group via intra-cavity
doubling of a mode-locked Ti:sapphire laser using a BaB2O4 crystal. The same
group has also set up a hot luminescence based sensitive time-resolved
spectrometer.  This  technique  has  been  used  to  study  the  relaxation  dynamics 
of  hot  carriers  in  III-V  compound  semiconductors.  C.  Pollock’s  group  has 
perfected its unique tunable color center laser based femtosecond pump-and-probe
characterization system for narrow band gap semiconductors. Carrier
relaxation data for excitations from the band edge up to a few optical phonon
energies in InGaAs thin films are being measured. The Monte Carlo
simulation  group  of  J.  P.  Krusius  has  completed  the  dual  carrier  code  for  the 
time-dependent  simulation  of  non-equilibrium  transport  in  two- 
dimensional  heterostructure  devices,  such  as  heterojunction  bipolar 
transistors.  In  parallel  this  group  has  been  collaborating  with  C.  Pollock's 
efforts  to  develop  and  analyze  the  femtosecond  pump-and-probe 
measurements  on  narrow  band  gap  semiconductors.  Most  recently  the  ability 
to  model  the  effect  of  dynamic  screening  on  carrier-optical  phonon  and 
carrier-carrier  scattering  has  been  added.  A  qualitative  agreement  between 
theory  and  experiment  for  InGaAs  thin  films  has  been  reached. 

The  Optoelectronics  Technology  Center  (OTC),  established  in 
September 1990 under DARPA support, with primary participants from
Cornell University, University of California Santa Barbara, and University of
California San Diego, is in its second year. C. Tang continues as one of the
leaders  of  this  multi-university  program.  The  Cornell  part  of  the  OTC 
proposal  leveraged  past  JSEP  research.  The  OTC  had  its  first  annual  review 
meeting at Cornell University in October 1991. In addition to C. Tang, JSEP
investigators R. Shealy, C. Pollock, and P. Krusius are involved in
the OTC research program.

Further  special  accomplishments  are  listed  in  the  description  of 
research  under  each  of  the  tasks. 

B.2  Real  Time  Signal  Processing 

The investigators involved in the real time signal processing theme,
Profs. G. Bilardi, A. Bojanczyk, F. Luk, and H. Torng, have continued their
synergistic work. F. Luk's group discovered the relationship between the
Berlekamp-Massey algorithm for decoding the Reed-Solomon code and the
well  known  Lanczos  algorithm.  H.  Torng's  group  has  made  significant 
progress  in  the  instruction  issuing  mechanism,  interrupt  handling,  branch 
prediction  and  multi-stream  processing,  all  problems  arising  in  efforts  to 
design faster superscalar computing machines. While A. Bojanczyk is new to
this group, he will bring a more hardware oriented approach and in the longer
term will impact the program considerably.

H.  Torng  organized  the  third  "Project  2000"  meeting  in  June  1991  at 
Cornell  to  report  on  computer  engineering  advances  in  the  past  year.  About 
15  industrial  representatives  attended  this  two  day  meeting.  F.  Luk  organized 
an  SPIE  meeting  on  advanced  signal  processing  algorithms,  architectures,  and 
implementations.  The  proceedings,  which  he  edited,  included  46  papers  and 
covered 494 pages. F. Luk and A. Bojanczyk were awarded a Warp
computer by DARPA. This GE built machine was installed at the Cornell
Engineering and Theory Center building in September 1990. Also, a group of
INMOS transputers has been installed at the E&TC for exploratory
computation.

Further  special  accomplishments  are  listed  in  the  description  of 
research  under  each  of  the  tasks. 
Task #1

Task  Principal  Investigator:  James  R.  Shealy 

(607)  255-4657 


OBJECTIVE

The  program  objective  for  the  materials  task  is  to  explore  the  use  of  OMVPE 
to  produce  novel  epitaxial  structures  for  new  high  speed  electron  devices.  The 
emphasis is to extend the capabilities of the OMVPE process to include
submicron selective growth of high mobility electron channels, as well as to use
conventional  OMVPE  techniques  to  produce  high  quality  III-V  alloys,  both 
lattice  matched  and  pseudomorphic,  for  the  device  fabrication  and  carrier 
relaxation  studies  ongoing  in  other  related  JSEP  program  tasks  (2-4).  The 
program  objectives  are  currently  proceeding  in  a  series  of  stages  involving  the 
operation  of  the  new  OMVPE  facility  at  Cornell  (which  includes 
environmental testing), the construction of an additional OMVPE reactor for
the selective growth studies, and the development of advanced optical
probing techniques for the non-destructive characterization of two-dimensional
epitaxial  structures. 

DISCUSSION  OF  STATE-OF-THE-ART 

The  following  discussion  of  the  state-of-the-art  is  organized  into  separate 
sections  on  photo-stimulated  selective  OMVPE,  the  properties  of  phonons  in 
strained  semiconductor  short  period  superlattices,  and  the  operation  of  a  safe 
OMVPE  process  with  hydrides. 

Selective  OMVPE  Deposition  with  Deep  UV  Radiation 

Deep  UV  photo-assisted  OMVPE  growth  is  one  of  the  most  promising  paths 
to  realizing  in  situ  submicron  selective  growth.  The  proposed  approach  for 
selective  OMVPE  on  the  submicron  scale  utilizes  tunable,  coherent  deep  UV 
radiation  in  contrast  to  all  previous  attempts.  The  reactants  used  in  OMVPE 
are generally transparent in the visible and infrared portions of the spectrum.
Optical activation of the growth has been achieved using visible laser
radiation, where the reactants on the growth surface are not directly excited.
The growth is apparently activated by local thermal heating or by carrier
generation near the growth surface [1]. These intermediate steps will most
likely  prevent  a  high  resolution  selective  growth  process  due  to  thermal 
and/or  carrier  diffusion  prior  to  the  activation  of  the  growth.  The  deep  UV 
approach also has its limitations. Previous studies using ArF excimer laser
excitation (193 nm) have found at best a 2:1 growth ratio in the
illuminated:dark regions of the substrate. Also, substantial carbon
concentrations [@ (10^) cm^-3] are observable in the selectively deposited
GaAs films [2]. The laser enhanced growth was found to be severely limited by
absorption in the gas phase. In spite of this, enhanced growth was observed
using 10 Hz pulse trains with energies as low as 10 mJ/cm^2/pulse with
trimethylgallium (TMG) and arsine.

With another approach, a visible laser stimulated Atomic Layer Epitaxial
(ALE) process, the selectivity has been dramatically improved [3].
Unfortunately, this process suffers from limited dimensional control for the
reasons stated above and from considerable carbon contamination. In all previous
studies the dimensional resolution of the selective growth process (best
linewidth) exceeds 10 µm.

With  a  tunable  deep  UV  source  it  becomes  feasible  to  perform  selective 
deposition  where  the  radiation  is  incident  on  the  growth  surface,  and  also  to 
selectively  excite  reactant  species  in  the  vapor  phase  or  in  the  adlayers  on  the 
surface. Furthermore, the Ga or Al sources may be selectively excited as the
absorption edges are distinctly separated in the UV spectrum. To illustrate
these excitations, the absorption spectra of TMG, trimethylaluminum (TMA),
and the surface chemisorbed species using TMG/AsH3 exposures are
presented  in  figure  1  [4]. 


Figure  1.  Room  temperature  UV  absorption  spectra  for  TMG,  TMA,  and  the  chemisorbed  surface 
layers caused by TMG and AsH3 exposures on a silica substrate surface. The chemisorbed surface
adlayers'  spectra  were  obtained  by  subtracting  the  TMG  vapor  phase  spectrum  from  the 
measured  data  (Ref.  4). 


As  is  evident  in  figure  1,  at  an  excitation  wavelength  of  220  nm,  the  surface 
phase  may  be  excited  without  appreciable  absorption  in  the  vapor  phase.  Note 
that  previous  studies  using  193  nm  excimer  laser  radiation  resulted  in 
excitation of both the vapor and surface phases, a problem which was cited to
degrade the stimulated growth rate and the dimensional selectivity of the
process [2]. The tunable deep UV source (frequency doubled, excimer pumped,
dye laser system) will allow the use of coherent radiation from 170 to 220 nm.
By selecting the laser wavelength, it becomes feasible to selectively excite the
Ga or Al vapor species using TMG and TMA by operation at 170 and 210 nm,
respectively. These experiments may allow a modulation in the alloy
composition x in Al(x)Ga(1-x)As films under the proper growth conditions. It
should  be  noted  that  the  absorption  feature  for  the  TMA  vapor  near  175  nm  is 
due  to  the  presence  of  the  dimer  species  (not  seen  with  TMG)  which 
transports  near  room  temperature  from  the  bubbler.  As  the  vapor  heats  in  the 
boundary  layer  above  the  substrate,  where  the  photo-excitation  takes  place, 
the  dimer  species  partial  pressure  is  substantially  reduced  (observed  by  a 
diminishing  175  nm  absorption  feature).  This  may  prevent  selective 
excitation of the TMA on or near the substrate surface. For the best
dimensional control, however, it appears that wavelengths longer than
220 nm will offer the advantage of a transparent vapor over the substrate.
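
As a rough aid to the wavelength discussion above, the photon energy at each excitation wavelength follows from E = hc/lambda. The sketch below is an illustration only and is not taken from the original study; the function name is hypothetical, and the wavelengths listed are simply those named in the text.

```python
# Photon energy E = h*c/lambda for the deep UV wavelengths named in the text.
# Illustrative sketch: the helper name is hypothetical, not from the report.

PLANCK_J_S = 6.62607015e-34        # Planck constant (J s), exact SI value
LIGHT_SPEED_M_S = 2.99792458e8     # speed of light in vacuum (m/s), exact SI value
ELECTRON_VOLT_J = 1.602176634e-19  # one electron volt in joules, exact SI value


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the photon energy in eV for a vacuum wavelength given in nm."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / ELECTRON_VOLT_J


if __name__ == "__main__":
    # Wavelengths mentioned in the discussion of TMG, TMA and surface adlayers.
    for nm in (170, 175, 193, 210, 220):
        print(f"{nm} nm -> {photon_energy_ev(nm):.2f} eV")
```

Shorter wavelengths carry more energy per photon, so moving from 193 nm to 220 nm lowers the photon energy by somewhat less than 1 eV.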

Phonons in Short Period (AlAs)(InAs) and (GaAs)(InAs) Superlattices

The  use  of  strained  layer,  short  period  superlattices  has  been  considered  an 
attractive  alternative  for  conventional  lattice  matched  bulk  ternary  layers  in 
heterostructure  devices.  Possible  attributes  of  the  superlattice  approach  are 
lower  thermal  spreading  resistance  of  laser  cladding  layers  and  improved 
impurity  activation,  especially  in  n-type  high  bandgap  alloys. 

The most straightforward implementation of this concept has been using
(AlAs)(GaAs) structures in heterostructure lasers emitting in the IR [5] and
the  visible  [6].  In  the  former  study,  graded  index  regions  are  synthesized  with 
graded  superlattice  periods  and  selectively  doped  cladding  layers  are  used.  The 
visible  laser  structures  have  benefited  from  superlattice  active  regions  as 
demonstrated  by  a  reduction  of  laser  threshold  currents  at  wavelengths  as 
short  as  680  nm. 

More recently, strained layer superlattices lattice matched (nominally) to InP
have been studied in the (GaAs)(InAs) system [7]. Improved structural and
optical  properties  are  associated  with  structures  produced  by  Migration 
Enhanced  Epitaxy  (MEE).  Besides  their  potential  device  applications,  these 
short  period  strained  layer  structures  are  interesting  from  a  fundamental 
viewpoint.  Phonons  and  electrons  in  layered  media  display  the  effects  of  zone 
folding  and  quantum  confinement  which  are,  in  turn,  sensitive  indicators  of 
material structure and quality. Depending on the nature of the bulk
dispersion in each region of a superlattice, the resulting vibrational modes are
either confined or propagative. In all cases, the acoustic branches fold in the
superlattice, giving rise to new Raman active phonons at the reduced zone
center. These modes are commonly observed to be as sensitive a measure of
periodicity as X-ray rocking curves or TEM diffraction data. It is interesting to
note that where the optical branches overlap, folding of the optic branches
occurs, indicating propagative modes. The region of overlap is substantially
increased in the (GaAs)(InAs) case if strain corrections are included in a 1D
linear chain calculation.
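
To make the idea of a 1D linear chain calculation concrete, the sketch below evaluates the textbook dispersion of a diatomic chain with nearest-neighbor springs, which yields one acoustic and one optical branch. It is not the strain-corrected superlattice calculation referred to above. The force constant is a placeholder value chosen only for illustration, and the function name is hypothetical.

```python
import math

AMU_KG = 1.66053906660e-27  # atomic mass unit in kg


def diatomic_chain_frequencies(q_a, mass1_amu, mass2_amu, force_constant_n_per_m):
    """Return (acoustic, optical) angular frequencies in rad/s.

    q_a is the wavevector times the lattice period (dimensionless, -pi..pi).
    Textbook result for two alternating masses joined by identical springs:
        w^2 = C*(1/m1 + 1/m2) -/+ C*sqrt((1/m1 + 1/m2)^2
                                          - 4*sin^2(q_a/2)/(m1*m2))
    """
    m1 = mass1_amu * AMU_KG
    m2 = mass2_amu * AMU_KG
    c = force_constant_n_per_m
    s = 1.0 / m1 + 1.0 / m2
    discriminant = s * s - 4.0 * math.sin(q_a / 2.0) ** 2 / (m1 * m2)
    root = math.sqrt(max(discriminant, 0.0))  # guard against round-off
    w2_acoustic = max(c * (s - root), 0.0)
    w2_optical = c * (s + root)
    return math.sqrt(w2_acoustic), math.sqrt(w2_optical)


if __name__ == "__main__":
    # Ga and As atomic masses; the spring constant is an assumed placeholder.
    GA_AMU, AS_AMU, SPRING = 69.723, 74.9216, 90.0
    for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
        ac, op = diatomic_chain_frequencies(frac * math.pi, GA_AMU, AS_AMU, SPRING)
        print(f"q*a = {frac:.2f} pi  acoustic = {ac:.3e}  optical = {op:.3e} rad/s")
```

In a superlattice the period is enlarged, so branches of this kind are folded back into a smaller zone; where the optical bands of the two constituents overlap in frequency, the folded modes can propagate through both layers.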

Safety  Issues  Concerning  the  OMVPE  Process  Using  Hydrides 

A  few  significant  features  commonly  found  in  the  OMVPE  process  and 
related  facilities  are  described  here.  Also  the  latest  N.Y.  State  guidelines  (as  set 
by their Department of Environmental Conservation - DEC) for emissions
into  the  environment  are  given  to  indicate  the  need  for  good  process  control 
from  the  arsine  source  to  the  exhaust  stack.  Most  of  the  information  given  is 
obtained  through  private  communications  and,  as  a  result,  is  not  referenced. 

Generally,  conventional  vented  gas  cabinets  are  used  to  house  the  hydride 
tanks which are fitted with a flow limiting orifice. If vented gas cabinets are to
be used, the dilution exhaust flows required to meet the 1/2 Immediately
Dangerous to Life or Health (IDLH) toxicity level, as required by the California
building code, are over 35,000 and 300,000 cfm for arsine cylinders with and
without a standard flow limiting orifice, respectively. It should be noted that no arsine
installation  in  the  U.S.  meets  the  1/2  IDLH  requirement  when  a  catastrophic 
cylinder  failure  occurs  and  few  can  handle  the  controlled  release  through  the 
orifice.  New  technology  is  needed  to  allow  the  OMVPE  technique  to  be  used 
in  production  environments  in  the  near  future.  Furthermore,  recent  changes 
in  the  DEC  code  in  N.Y.  State  are  more  stringent  than  the  California  code.  For 
example, the ambient guideline concentrations for arsine emissions into the
environment are 6.5(10^-2) and 7.4(10^-5) parts per billion (ppb) for short term
release (1 hour) and annual averages, respectively. These numbers are many
orders of magnitude less than the TLV value of 50 ppb. This requires that
spills  be  contained  and  the  exhaust  from  reactors  be  treated  prior  to  release  up 
the  stack.  An  approach  taken  at  Cornell  will  be  described  in  the  discussion  of 
progress  below. 
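
The gap between the workplace limit and the ambient guidelines can be put in numbers with a short calculation. The sketch below takes the guideline values as 6.5 x 10^-2 ppb (1 hour) and 7.4 x 10^-5 ppb (annual); the negative exponents are an assumed reading of the printed figures, chosen because the text states that both lie well below the 50 ppb TLV. The function name is hypothetical.

```python
import math

TLV_PPB = 50.0  # threshold limit value quoted in the text

# Assumed reading of the guideline exponents (negative powers of ten).
GUIDELINES_PPB = {
    "short term (1 hour)": 6.5e-2,
    "annual average": 7.4e-5,
}


def orders_of_magnitude_below(reference_ppb: float, value_ppb: float) -> float:
    """Return log10(reference / value): how many decades value lies below reference."""
    if reference_ppb <= 0 or value_ppb <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(reference_ppb / value_ppb)


if __name__ == "__main__":
    for label, limit in GUIDELINES_PPB.items():
        decades = orders_of_magnitude_below(TLV_PPB, limit)
        print(f"{label}: {limit:g} ppb, {decades:.1f} decades below the TLV")
```

Under this reading the short term guideline sits roughly three decades below the TLV and the annual guideline nearly six.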

It  is  worth  noting  that  the  use  of  ethyl-organometallic  sources  generally 
reduces the amount of hydride consumption by as much as 2 orders of
magnitude  for  acceptable  quality  films.  As  a  result,  a  safer  OMVPE  process 
emerges  using  ethyl  sources.  For  example,  workers  have  reported  high  purity 
GaAs  grown  at  10  torr  using  triethylgallium  (TEG)  and  arsine  with  a  V/III 
ratio of unity [8]. Recent results on low V/III ratios used in GaAs growth will
be  presented  below. 
In this section, progress on the OMVPE growth and characterization of
undoped GaAs is presented, as well as the MEE growth and characterization
of GaInAs and AlInAs structures. A brief update on the status of the selective
growth reactor and the deep UV laser system will follow. Finally, a discussion
of the secondary hydride containment system in use in the OMVPE
facility and results on stack testing are presented.

OMVPE  Growth  of  Undoped  GaAs 

The  operation  of  Cornell's  new  OMVPE  facility  began  in  late  1991.  A 
multichamber  reactor  [9]  is  currently  in  use  for  the  growth  of  GaAs  using  TEG 
and  arsine.  A  table  which  summarizes  the  first  9  growth  runs  of  undoped 
GaAs  is  given  below. 

Table 1. Summary of the growth experiments performed on undoped GaAs giving the growth
parameters and measured layer thickness (by angle bevel and stain). The TEG flow rate is
calculated from published vapor pressure data and measured pressure and flow data.

Growth Temp.  Group III flow  Group V flow  V/III ratio  Pressure  Thickness
(°C)          (sccm)          (sccm)                     (torr)    (µm)

577           0.9             66            75           76        1
577           2.6             66            25           76        2.46
577           2.6             34            13           76        3.14
577           2.6             49            19           76        6
577           2.6             13            5            76        3
577           2.6             19.5          7.5          76        3
635           2.6             80            30           76        4.5
577           2.6             13            5            76        3.5
635           2.6             42            20           76
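
The V/III column of Table 1 can be cross-checked against the two flow columns. The sketch below assumes the ratio is the plain quotient of the group V flow and the group III flow, with both flows in the same unit (read here as sccm); the run list is copied from the table, and the function name is hypothetical.

```python
# Cross-check of V/III ratios from the tabulated flows.
# Assumption: V/III = (group V flow) / (group III flow), same units for both.

# (growth temperature in deg C, group III flow, group V flow) from Table 1.
RUNS = [
    (577, 0.9, 66.0),
    (577, 2.6, 66.0),
    (577, 2.6, 34.0),
    (577, 2.6, 49.0),
    (577, 2.6, 13.0),
    (577, 2.6, 19.5),
    (635, 2.6, 80.0),
    (577, 2.6, 13.0),
    (635, 2.6, 42.0),
]


def v_iii_ratio(group_iii_flow: float, group_v_flow: float) -> float:
    """Return the gas phase V/III ratio as the quotient of the two flows."""
    if group_iii_flow <= 0:
        raise ValueError("group III flow must be positive")
    return group_v_flow / group_iii_flow


if __name__ == "__main__":
    for temp_c, flow_iii, flow_v in RUNS:
        ratio = v_iii_ratio(flow_iii, flow_v)
        print(f"{temp_c} C  III = {flow_iii:4.1f}  V = {flow_v:5.1f}  V/III = {ratio:5.1f}")
```

The computed quotients can be compared with the tabulated column; small differences would be expected if the printed ratios were rounded or based on unrounded flow values.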
We have explored the best growth conditions for ultra-high purity layers at
low  arsine  flows  (alternatively  low  V/III  ratios).  The  layers  have  been 
characterized using 77 K and room temperature Hall measurements,
capacitance-voltage (CV) measurements, and low temperature
photoluminescence (PL). Several interesting features have emerged from this
series of experiments including the inter-related effects of gas phase
stoichiometry and growth temperature. At sufficiently high growth
temperatures  (>  600  °C),  the  normal  p/n  conversion  apparently  does  not 
occur,  which  is  attributed  to  the  total  lack  of  carbon  contamination  in  many 
films.  All  the  films  grown  above  600  °C  are  n-type  and  the  carrier 
concentration is typically less than 5(10^14) cm^-3. Mobilities at 77 K have been
measured to exceed 101,000 cm^2/V sec. This corresponds to a total ionized
donor and acceptor concentration of 5(10^) cm^-3. Low temperature PL data
for  several  films  grown  at  577  °C  are  presented  in  figure  2. 


[Figure 2 plot; axis label: Energy (eV)]

Figure 2. Low temperature (0.9 K) photoluminescence spectra of undoped GaAs grown with TEG
and arsine at 577 °C and at the indicated gas phase stoichiometry. A spectrum of a sample grown
using TMG and arsine at 650 °C in the same reactor is shown for comparison. The excitation
conditions and experimental resolution are indicated.

The most striking feature in these PL data is the lack of donor-carbon acceptor
pair transitions, which appear near 1.495 eV, for samples grown with the TEG
source. This is observed even at the low V/III ratios, meaning the ethyl source
requires at least an order of magnitude less hydride consumption for high
quality films. The bound exciton and free exciton transitions are readily
resolved at photon energies near 1.515 eV, which is a sensitive indicator of
material quality. Finally, excitons bound to neutral acceptors are barely visible
in this spectral range, indicating that low acceptor concentrations are present
in the recently grown samples.


OMVPE  Reactor  for  Deep  UV  Stimulated  Selective  Growth 

A  second  OMVPE  reactor  is  currently  nearing  the  end  of  its  construction  and 
will  be  used  to  perform  photo-stimulated  growth  using  a  coherent,  tunable 
laser  source.  The  reaction  cell  is  integrated  onto  an  optical  table  and  a 
piezoelectric  3  axis  drive  will  allow  scanning  of  a  focused  submicron  spot 
across the growth surface. We have tested the laser performance and the
results look encouraging. When the dye laser is pumped at 25 Hz, the average
power of the second harmonic exceeds 80 mW (1.7 mJ/pulse) and is tunable
over the spectral range from 230 to 255 nm with peak power available at 240
nm. This performance is expected to allow excitation of the surface reactants
without absorption in the gas phase at sufficient energy to stimulate the
growth over 1 cm^2 areas. This projection is based on data from reference 2
and  the  absorption  spectra  in  figure  1.  The  first  selective  growths  are  planned 
for  late  summer  in  1992. 

Test  Results  on  Arsine  Containment  and  Emissions 

The  arsine  cylinders  are  housed  in  a  secondary  containment  system  which 
has been fully pressure tested at the extreme limits of its intended use. These
containment systems (one for arsine, one for phosphine) are now able to
handle a catastrophic cylinder failure. The exhaust from this containment
system and from the reactor is passed through a high temperature incinerator
for  destruction  of  residual  arsine.  We  have  been  monitoring  arsine  emissions 
on  the  exhaust  stack  since  the  first  run  of  the  multichamber  OMVPE  reactor. 
The  level  of  detection  is  less  than  3  ppb  prior  to  dilution  and  dispersion  at  the 
top  of  the  stack.  After  the  first  several  runs  it  was  found  that  the  incinerator 
required modifications to meet emission standards. Oxygen/air mixtures are
used to ensure complete combustion of hydrogen and arsine. The procedures
used for each growth experiment were modified to eliminate "spikes" of
arsine at the beginning and end of each run. We (together with a representative
from the DEC) now observe no detectable emissions throughout a
given  growth  run.  Given  the  10^  dilution  which  occurs  at  the  stack  and  the 
level  of  detection,  Cornell's  lab  meets  the  most  stringent  environmental 
protection  standards.  Some  aspects  of  the  design  are  currently  under 
consideration  for  a  patent  application. 

SCIENTIFIC  IMPACT  OF  RESEARCH 

The  scientific  impact  of  this  research  task  can  be  summarized  in  two  major 
points. First, an OMVPE facility can be designed and implemented to ensure a
minimal  level  of  risk  to  personnel,  the  general  public  and  the  environment. 
Furthermore,  using  certain  combinations  of  organometallic  precursors,  the 
amount  of  arsine  consumption  is  substantially  reduced.  The  PL  data  on  films 
grown with low arsine and TEG show the absence of acceptors, in particular
carbon and zinc, which are commonly found in OMVPE materials. When used
in combination with a new Al precursor, trimethylamine alane, we anticipate
high quality Al containing alloys can be grown in the near future. Second, the
development  of  the  submicron  selective  OMVPE  growth  process,  using  a 
high  power  deep  UV  coherent  source,  will  potentially  revolutionize  the 
development  of  III-V  based  integrated  circuits  including  those  with 
optoelectronic  elements.  The  laser  system  is  commercially  available  and  can 
be  readily  integrated  into  an  OMVPE  reaction  cell  to  stimulate  the  reactions 
on  the  growth  surface  without  photo  decomposition  in  the  vapor  phase. 
OBJECTIVE

The  objective  of  this  task  is  to  develop  new  femtosecond  sources  and 
measurement  techniques  and  to  use  such  sources  and  techniques  to  study 
ultrafast  processes  in  compound  semiconductors  and  related  structures.  On 
source  development,  current  emphasis  is  on  high  repetition  rate  all-solid- 
state  femtosecond  sources  and  in  extending  the  tuning  range  of  such  sources. 
On  optical  measurement  techniques,  current  emphasis  is  on  developing 
optical sampling techniques with femtosecond time resolution based on
up-conversion processes. These sources and techniques are being successfully
applied to studying the relaxation dynamics of non-equilibrium carriers in III-V
compounds and quantum wells. The capture problem and the problem of
tunneling  of  coherent  wave  packets  in  quantum  wells  are  of  particular 
interest  at  the  present  time. 

DISCUSSION  OF  THE  STATE-OF-THE-ART 

Almost  all  the  work  on  femtosecond  optics  and  ultrafast  processes  in  the  past 
has  been  based  on  the  use  of  the  mode-locked  Rh6G  femtosecond  dye  laser  as 
the  primary  source  of  short  pulses  of  light.  The  trend  recently  has  been  to 
move away from the dye lasers to all-solid-state short pulse sources. The CW
mode-locked Ti-doped sapphire laser has been the most widely used new primary
femtosecond laser source. The Ti:sapphire laser is tunable over the range of
720 nm to about 1 µm. The emphasis of our work has been to extend the
useful  range  of  all-solid-state  femtosecond  lasers  to  beyond  this  range  through 
nonlinear  optical  techniques.  Very  significant  progress  has  been  made  in  this 
effort  during  the  past  year  and  the  results  are  discussed  below  in  the  Progress 
section. 

In  the  case  of  femtosecond  optical  measurement  techniques,  most  of  the  past 
studies  of  ultrafast  phenomena  have  been  based  upon  some  sort  of  pump- 
probe  measurement,  including  the  related  correlation  spectroscopic 
techniques.  All  these  techniques  suffer  from  the  fact  that  during  the  probing 
process,  the  system  being  measured  is  also  disturbed  to  some  extent.  To  avoid 
perturbing  the  system  being  measured,  the  time-resolved  hot  luminescence 
up-conversion  technique  has  been  developed  to  study  the  relaxation 
dynamics of non-equilibrium carriers in semiconductors at a number of
laboratories  recently,  including  Cornell.  This  technique  allows  optical 
sampling  with  a  time  resolution  on  the  order  of  50  fs  of  the  very  weak  hot 
luminescence  emitted  by  the  carriers  during  the  relaxation  process.  This 
technique has now been developed to the point where the dark noise count is down
to  half  a  photon  per  second  and  has  been  used  successfully  to  yield 
unambiguous  data  on  the  cooling  rates  of  hot  carriers  in  bulk  GaAs  and 
GaAs/AlGaAs  quantum  wells  at  high  carrier  densities.  An  earlier 
controversy  on  this  issue  is  thus  resolved  unambiguously  and  put  to  rest. 

The question of the relative hot-electron cooling rates for bulk and quantum
well (QW) structures is an important, basic one that affects many applications of
quantum  well  structures  and  has  inspired  a  large  number  of  theoretical 
studies.  It  is  well  known  that  the  hot-carrier  cooling  rate  decreases  with 
increasing  carrier  concentration  in  both  QW  and  bulk  structures.  The  first 
study  comparing  GaAs/AlGaAs  QW's  and  bulk  GaAs  reported  similar 
cooling rates at a carrier density (n) of 2.5 x 10^17 cm^-3 [1]. Subsequent studies
reported a much slower cooling rate in QW's than in the bulk at higher
carrier densities (n > 10^18 cm^-3) [2-5]. Nevertheless, in a number of recent
publications  Leo  et  al.  [6,  7]  have  cited  these  previous  results  as  being 
contradictory  and  concluded  that  the  cooling  rates  of  bulk  GaAs  and 
GaAs/AlGaAs QW's are equivalent. These conclusions, however, were
based on comparisons limited to the carrier density range of 10^15 < n < 10^18
cm^-3, generalizing that the independence of carrier cooling from
dimensionality is itself independent of carrier density. The former studies [2-5]
were  carried  out  using  nonlinear  intensity  correlation  spectroscopy.  The 
latter  study  was  carried  out  using  a  time-resolved  streak-camera  with  a  time 
resolution of ~20 ps. Using the more accurate time-resolved hot-luminescence
spectroscopic technique we have now shown conclusively [8]
that  the  cooling  rates  of  the  quantum-well  structures  are  significantly  slower 
than those of the bulk for n ≥ 5 x 10^18 cm^-3 and similar at 2 x 10^18 cm^-3 or lower,
thus confirming the earlier observations [2-5] of the difference in the cooling
rates between the bulk and quantum well structures. This difference could
not have been seen in the carrier density range studied by Leo et al. [6, 7].

PROGRESS 

High  repetition  rate  femtosecond  pulse  generation  in  the  blue  [9]  -  We  have 
succeeded  in  generating  very  high  average  powers  of  high  repetition  rate 
femtosecond  pulses  in  the  blue  for  the  first  time  by  intracavity  doubling  of  a 
mode-locked Ti:sapphire laser using β-BaB2O4 (BBO). To reduce the pulse
broadening  effect  of  group  velocity  mismatch,  an  extremely  thin  BBO  crystal 
is  used.  Pumping  the  Ti:sapphire  laser  with  4.4  W  from  an  Ar+  laser,  up  to 
230  mW  of  430  nm  light  is  produced  at  72  MHz  repetition  rate  and  89  fs 
pulsewidth. This represents an effective conversion efficiency of ~75% from
the typical infrared output to the second harmonic. Pulse widths as short as
54 fs are achieved for the blue output. Recent conversion to a higher power
Ar+  laser  pump  has  led  to  blue  output  in  the  range  of  400  mW.  We  expect 
eventually  to  reach  the  1  W  level.  When  this  is  reached,  we  expect  to  be  able 
to  generate  femtosecond  uv  pulses  near  200  nm  at  a  substantial  power  level, 
which  should  open  up  many  ultrafast  processes  in  a  variety  of  materials  and 
structures  for  study  in  the  femtosecond  time  domain.  Work  is  also  under 
way  to  achieve  extra-cavity  pumping  of  the  femtosecond  OPO  by  the 
Ti:sapphire laser. This should make it much easier for others to operate the fs
OPO's than in the case of the intracavity-pumped OPO demonstrated by us earlier.
This should open up the broad mid-ir range up to 4.5 μm for studies in the
femtosecond time domain by many laboratories.
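
The figures quoted above (230 mW average power, 72 MHz repetition rate, 89 fs pulsewidth, 430 nm) can be turned into a per-pulse energy and an approximate peak power. The sketch below is only illustrative: the function names are hypothetical, the sech^2 shape factor of 0.88 is an assumption (the report does not state the pulse shape), and treating the uv output as a further doubling of the 430 nm light is likewise an assumption.

```python
def pulse_energy_j(average_power_w, repetition_rate_hz):
    """Energy per pulse for a pulse train of given average power."""
    return average_power_w / repetition_rate_hz


def peak_power_w(energy_j, fwhm_s, shape_factor=0.88):
    """Approximate peak power; 0.88 is the usual factor for a sech^2 pulse
    (an assumption here, since the report does not state the pulse shape)."""
    return shape_factor * energy_j / fwhm_s


def harmonic_wavelength_nm(fundamental_nm, order=2):
    """Wavelength of the n-th harmonic of a given fundamental."""
    return fundamental_nm / order


# Values quoted in the report for the blue output.
blue_energy = pulse_energy_j(0.230, 72e6)      # 230 mW at 72 MHz
blue_peak = peak_power_w(blue_energy, 89e-15)  # 89 fs pulsewidth
uv_nm = harmonic_wavelength_nm(430.0)          # assumed further doubling of 430 nm

print(f"energy per pulse: {blue_energy * 1e9:.2f} nJ")
print(f"peak power:       {blue_peak / 1e3:.1f} kW")
print(f"doubled 430 nm:   {uv_nm:.0f} nm")
```

Under these assumptions each blue pulse carries a few nanojoules, and a further doubling of 430 nm would land at 215 nm, consistent with the "near 200 nm" target mentioned above.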

Hot  luminescence  spectroscopy  [9]  -  A  very  sensitive  time-resolved 
upconversion  spectroscopic  setup  has  been  built  in  our  laboratory.  This  time- 
resolved  spectrometer  has  a  time  resolution  on  the  order  of  50  fs,  spectral 
range from approximately 400 nm to 2.5 μm, and noise of 0.5 Hz. It has been
successfully  used  to  study  the  relaxation  dynamics  of  hot  carriers  in  III-V 
compounds  and  will  be  applied  to  the  quantum  well  carrier-capture  problem 
and  the  coherent  electron-wavepacket  tunneling  problem. 

Comparison  of  hot-carrier  relaxation  in  quantum  wells  and  bulk  GaAs  at 
high  carrier  densities  [8]  -  An  investigation  of  the  hot-carrier  relaxation  in 
GaAs/(Al,Ga)As  quantum  wells  and  bulk  GaAs  in  the  high-carrier-density 
limit has been completed. Using a time-resolved luminescence up-conversion
technique with ≤ 80 fs temporal resolution, carrier temperatures are
measured in the 100 fs to 2 ns range. Our results show that the hot-carrier
cooling rates in the quantum wells are significantly slower than in the bulk
for carrier densities greater than 2x10^18 cm^-3. A comparison is made with
previous  publications  to  resolve  the  confusion  concerning  the  difference  in 
cooling  rates  in  quasi-two  and  three  dimensional  systems. 

SCIENTIFIC  IMPACT  OF  RESEARCH 

The  femtosecond  sources  and  measurement  techniques  developed  should  be 
of  great  use  to  others  in  the  scientific  community.  The  results  obtained  on 
the  dynamics  of  nonequilibrium  carriers  in  III-V  compounds  and  structures 
are  of  fundamental  importance  to  the  understanding  of  the  physics  and  the 
design  of  ultra-high  speed  semiconductor  electronic  and  optical  devices. 

DEGREES  AWARDED 

1. E. S. Wachman
"Ultrafast spectroscopy with a novel broadly tunable cw femtosecond source"
PhD, Applied Physics, 1991.

2. W. H. Loh
"Polarization Self-Modulation in Semiconductor Lasers"
PhD, Electrical Engineering, August, 1991.

3. Y. Ozeki
"Study of two-mode optical bistable semiconductor laser diodes with intracavity
saturable absorbers"
PhD, Electrical Engineering, August, 1991.

REFERENCES 

[1] C. V. Shank, R. L. Fork, R. Yen, J. Shah, B. I. Greene, A. C. Gossard, and C.
Weisbuch, Solid State Commun. 47, 981 (1983).

[2] Z. Y. Xu and C. L. Tang, Appl. Phys. Lett. 44, 692 (1984).

[3] H. Uchiki, Y. Arakawa, H. Sakaki, and T. Kobayashi, Solid State Commun.
55, 311 (1985).

[4] S. A. Lyon, J. Lumin. 35, 121 (1986).

[5] A. J. Nozik, C. A. Parsons, D. J. Dunlavy, B. M. Keyes, and R. K.
Ahrenkiel, Solid State Commun. 75, 297 (1990).

[6] K. Leo, W. W. Rühle, and K. Ploog, Phys. Rev. B 38, 1947 (1988); Solid-State
Electron. 32, 1863 (1989).

[7] K. Leo, W. W. Rühle, H. J. Queisser, and K. Ploog, Appl. Phys. A 45,
35 (1988); Phys. Rev. B 37, 7121 (1988).

[8] W. S. Pelouch, R. J. Ellingson, P. E. Powers, C. L. Tang, D. M. Szmyd, and
A. J. Nozik, Phys. Rev. B 45, 1450 (15 January, 1992); 7th International
Conference on Hot Carriers in Semiconductors, Nara, Japan
(July, 1991).

[9] "High Repetition Rate Femtosecond Pulse Generation in the Blue," R. J.
Ellingson and C. L. Tang, Optics Letters (scheduled for March, 1992); also
to be presented at CLEO '92, Anaheim, CA (May, 1992).

JSEP PUBLICATIONS AND TALKS

1. "Broadly tunable cw femtosecond optical parametric oscillators," C. L.
Tang,  W.  Pelouch,  and  P.  Powers,  invited  talk,  CLEO  '91,  Baltimore,  MD 
(May,  1991). 

2.  "Polarization  bistability  in  semiconductor  lasers,"  C.  L.  Tang,  Y.  Ozeki, 
and  J.  Johnson,  CLEO  '91,  Baltimore,  MD  (May,  1991). 


3.  "Femtosecond  optical  parametric  oscillators,"  C.  L.  Tang,  invited  talk, 
American  Physical  Society  March  Meeting,  Cincinnati,  OH  (March  18-21, 
1991). 

4.  "Femtosecond  optics,"  International  Workshop  on  Lasers  in  Chemistry 
and  Physics,  Dalian,  China,  sponsored  by  UNESCO  and  the  Chinese 
Academy  of  Sciences  (May  22-28, 1991). 

5. "CW femtosecond pulses tunable in the near- and mid-infrared," E. S.
Wachman, W. S. Pelouch, and C. L. Tang, J. Appl. Phys., 70, 1893
(1 August, 1991).

6.  "Comparison  of  hot-carrier  relaxation  in  quantum  wells  and  bulk  GaAs 
at high carrier densities," W. S. Pelouch, R. J. Ellingson, P. E. Powers,
C.  L.  Tang,  D.  M.  Szmyd,  and  A.  J.  Nozik,  Phys.  Rev.  B  45,  1450  (15 
January,  1992). 

7.  "Investigation  of  hot  carrier  relaxation  in  quantum  well  and  bulk  GaAs 
at  high  carrier  densities,"  W.  S.  Pelouch,  R.  J.  Ellingson,  P.  E.  Powers,  C. 
L.  Tang,  D.  M.  Szmyd,  and  A.  J.  Nozik,  Proceedings  of  7th  International 
Conference  on  Hot-Carriers  in  Semiconductors,  Nara,  Japan  (July  1991). 

8.  "High  Repetition  Rate  Femtosecond  Pulse  Generation  in  the  Blue,"  R.  J. 
Ellingson  and  C.  L.  Tang,  Optics  Letters  (scheduled  for  March,  1992);  also 
to be presented at CLEO '92, Anaheim, CA (May, 1992).

9. "Polarization bistability in semiconductor lasers with intracavity
multiple quantum well saturable absorbers," Y. Ozeki, J. E. Johnson, and
C. L. Tang, Appl. Phys. Lett., 58, 1958 (1991).

10.  "Polarization  switching  and  bistability  in  an  external  cavity  laser  with  a 
polarization-sensitive  saturable  absorber,"  Y.  Ozeki  and  C.  L.  Tang,  Appl. 
Phys.  Lett.,  58,  2214  (1991). 

11.  "Dynamics  of  hot  carriers  in  quantum  wells  and  bulk  GaAs  at  high 
carrier densities: femtoseconds to nanoseconds," invited talk, SPIE
Conference  on  Ultrafast  Laser  Probe  Phenomena  in  Bulk  and 
Microstructures  in  Semiconductors  and  Superconductors,  Somerset,  NJ 
(22-26  March,  1992). 

12. "Optical parametric oscillators," C. L. Tang, invited talk, International
School  on  Optics  and  Optical  Physics,  Capri,  Italy  (1992). 


ULTRAFAST  INTERACTIONS  OF  CARRIERS  AND  PHONONS  IN 
NARROW  BANDGAP  SEMICONDUCTOR  STRUCTURES 


Task  #3 
(607) 255-5032


OBJECTIVE 

Our goal is to provide experimental data that can be used to confirm or refute
the current models used to describe hot electron relaxation in narrow
bandgap semiconductors. To this end, we are measuring the carrier
relaxation in various samples and providing the results of the measurements
to Prof. Krusius for comparison with his group's simulations of the scattering
process.

DISCUSSION OF STATE-OF-THE-ART

Most  of  the  ultrafast  work  in  InGaAs  has  been  on  quantum  well  structures, 
such  as  laser  diode  amplifiers.  One  motivation  for  doing  femtosecond  studies 
of  diode  amplifiers  is  "gain  compression",  which  refers  to  a  decrease  in  gain  at 
high  modulation  rates.  Most  pump-probe  work  on  amplifiers  attempts  to 
characterize  this  effect.  Working  in  a  semiclassical  picture  (carrier/photon 
rate  equations),  one  can  model  the  recovery.  The  debate  in  the  field  is  about 
the  importance  of  different  forms  of  carrier  relaxation.  Spectral  hole  burning 
has  been  suggested  to  explain  the  transient  gain  decrease.  Others  advocate 
carrier  heating:  carrier  transitions  occur  away  from  the  bandgap  due  to  the 
wideband  nature  of  the  femtosecond  pulses,  while  stimulated  emission 
occurs  preferentially  at  the  bandgap  [1].  The  effect  occurs  because  carriers  need 
to  cool  down  to  reach  the  bandgap  and  be  more  readily  available  for 
stimulated  emission. 

Recently, a new candidate for the gain nonlinearity has been proposed. Weiss
[2] considered an InGaAs/InGaAsP MQW structure similar to that in Hall's paper
above and proposed that the nonlinearity arises from spatial transport of
carriers from the boundary layers cladding the MQW structure. The experiments
appear identical; the difference lies in the models used to fit the data by the
two groups. Hall uses rate equations with a phenomenological heating term,
while Weiss fits the gain decay rate to a carrier diffusion model. A third
amplifier gain compression paper deals with the same issues [3].

Carruthers [4] at NRL examined a vertical stack heterojunction bipolar
transistor. He used a modelocked Er-fiber laser to generate 1.6 ps pulses at
1.53 μm. With the light shining on the emitter contact layer (the InGaAs
layer that absorbed most of the light and injected photocarriers into the
InP/InGaAs/InP transistor structure below) and the collector grounded, the
emitter photocurrent was observed with a 40 GHz scope. This revealed a 12 ps
electrical pulse. The author attributes the width to RC effects, but the
experiment in principle could measure transit times.

Knox [5] laid down a parallel stripline on top of a quantum well. A pump
beam at the appropriate energy generated excitons, which were rapidly
ionized by an applied electric field. The electrical transient thus generated
propagated rapidly inside the stripline. Because the exciton creation time is
very fast, and the exciton absorption line is Stark shifted by an E field, the
exciton absorption strength was a sensitive measure of the field inside the
stripline. Knox used a pump at one end of the line to generate an initial
phototransient, and used a probe beam at the same energy but displaced along
the line to measure the electric field strength, with 100 fs time resolution and
1 μm spatial resolution. He observed signals propagating near the speed of
light  (since  the  QW  was  very  thin,  with  no  substrate,  the  electrical  group 
velocity  is  extremely  large  in  these  structures). 

PROGRESS

We  found  through  comparison  of  data  to  simulation  that  an  experimental 
accuracy  and  repeatability  of  1  part  in  1000  was  required.  This  is  not  simply  a 
signal-to-noise issue, but a problem of removing all systematic errors from the
data  to  this  level.  The  search  for  systematic  errors  has  been  the  thrust  of  most 
of  our  work  in  the  past  year. 

Much  of  the  error  in  our  signals  was  traced  to  the  variable  delay  arm  of  the 
spectrometer.  We  found  that  our  data  was  slightly  asymmetric,  at  least  on  the 
scale  of  parts  per  thousand.  The  source  of  the  asymmetry  was  found  to  lie  in 
a slow creep of the galvanometer that was used to scan the retroreflecting
mirror of the delay arm. In response, we improved the galvanometer-driven
optical delay device by using a lightweight translation stage to ensure
linearity,  and  by  using  position  sensitive  feedback  to  monitor  and  control  the 
galvo  itself. 

We  have  evaluated  several  different  noise  reduction  techniques.  We  found 
that  the  weak  reflection  of  the  pump  laser  from  our  femtosecond  laser  led  to 
amplitude  instabilities  in  the  output  power,  and  this  caused  excess  noise  in 
our  signal.  We  implemented  a  new  feedback  amplitude  stabilizer  using 
acousto-optic  (AO)  modulation  of  the  pump  for  our  ultrafast  laser.  The  AO 
cell both stabilizes the amplitude of the pump beam and acts as an optical
isolator between the two systems, reducing the deleterious feedback. In
addition,  we  have  refined  the  existing  feedback  control  of  cavity  length. 
Finally,  we  have  mapped  out  a  larger  range  of  operating  parameters  of  the 
APM  laser,  allowing  us  to  generate  a  more  stable  train  of  femtosecond  pulses. 


Our data show that the recovery of the measured transmission fits a single
exponential decay, which is a simplification of the process but describes the
data very well. Data are being collected as a function of wavelength, and we are
just  setting  up  a  temperature  stage  to  measure  the  effects  of  increased 
temperature  on  the  scattering  rates. 
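
As an illustration of what fitting a single exponential decay involves, the sketch below extracts an amplitude and time constant from a transient by a straight-line least-squares fit to the logarithm of the signal. It is not the analysis code used in this task: the function name, the synthetic data, and the 250 fs time constant are all hypothetical, and the log-linear method assumes a strictly positive, background-subtracted signal.

```python
import math


def fit_single_exponential(times, signal):
    """Fit signal = A * exp(-t / tau) by linear least squares on ln(signal).

    Requires every signal value to be positive (background already removed).
    Returns (A, tau) in the units of the inputs.
    """
    logs = [math.log(s) for s in signal]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


# Synthetic, noise-free transient with hypothetical parameters (times in fs).
true_amp, true_tau = 1.0e-3, 250.0
times = [20.0 * i for i in range(50)]
signal = [true_amp * math.exp(-t / true_tau) for t in times]

amp, tau = fit_single_exponential(times, signal)
print(f"fitted amplitude {amp:.3e}, fitted time constant {tau:.1f} fs")
```

With real data at the part-per-thousand level discussed above, a weighted or nonlinear fit would likely be preferable, since taking the logarithm amplifies noise in the tail of the decay.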


SCIENTIFIC IMPACT OF RESEARCH


These  measurements  will  provide  a  test  of  the  models  used  by  designers  to 
simulate  the  performance  of  InGaAs  devices.  They  will  be  invaluable  for 
refining  and  improving  present  theory  and  simulation  of  hot  carriers  in 
narrow  bandgap  materials. 
OBJECTIVE


The objective in this work unit is to explore theoretically the non-equilibrium
carrier processes governing electron and hole transport and optical interactions
in inhomogeneous compound semiconductor heterostructures.
Electron and hole interactions among themselves and with the semiconductor
lattice, optical fields, and external electric fields are described using
self-consistent ensemble Monte Carlo formulations including quantum well
phenomena. This work is done collaboratively with femtosecond optical
measurements and materials growth efforts. In the area of femtosecond optical
probing with tunable lasers, joint work is performed together with C. Pollock's
research group in order to design samples and optical experiments, analyze
measured data, and extract microscopic information on femtosecond carrier
processes. Two PhD graduate students, J.E. Bair and S. Weinzierl, have worked
on this task in addition to the principal investigator.

DISCUSSION  OF  STATE-OF-THE-ART 


A  variety  of  ballistic  injection  cathodes  have  been  built  into  the  structure  of 
compound  semiconductor  devices  in  order  to  enhance  their  high  speed 
performance.  These  include  the  n+/n  homojunction  [1],  the  p+/n 
heterojunction  [2],  the  abrupt  heterojunction  [3],  and  the  Schottky  barrier  [4].  It 
has been experimentally demonstrated that these injection cathodes can
generate a significant quasi-ballistic electron fraction in compound
semiconductor devices, but despite such observations there is no conclusive
evidence on whether such injection structures substantially increase device
performance. The first published observation of the probing of the
non-equilibrium electron distribution was made on the planar doped barrier
transistor (PDBT), which was used as a hot electron spectrometer [5]. Another
device structure used at about the same time was the tunneling hot electron
transfer amplifier (THETA), first in its vertical form [6] and later in its lateral
form [7]. Several attempts to fabricate unipolar FET devices with hot electron
cathodes to generate ballistic electrons were also made but results were
ambivalent [8, 9]. However, because of the complexity of designing and
fabricating three terminal devices with hot electron cathodes, it was not clear
whether fabricated devices suffered from materials growth or processing
related non-idealities, incorrect device designs, or the insignificant impact of
ballistic  electron  processes  on  the  terminal  characteristics  of  devices.  In  order 
to  resolve  these  fundamental  questions  it  is  necessary  to  explore  the 
microscopic  physics  of  the  transport  in  such  devices  via  sophisticated 
simulation  techniques,  which  can  resolve  the  underlying  microscopic 
processes,  including  non-equilibrium  carrier  transport  in  inhomogeneous 
devices.  Consequently  a  two-dimensional  time-dependent  self-consistent 
ensemble  Monte  Carlo  method  has  been  developed  by  this  group  and  used  to 
examine  the  impact  of  the  above  and  many  other  hot  carrier  processes  on  the 
terminal  characteristics. 
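
For readers unfamiliar with the ensemble Monte Carlo method, the core stochastic step is sketched below: a free-flight time is drawn from an exponential distribution set by a constant total rate, and the scattering event that ends the flight is chosen in proportion to the individual rates, with the remainder assigned to fictitious "self-scattering". This is a textbook-style sketch only, not the group's simulator; the rate values and mechanism names are hypothetical placeholders.

```python
import math
import random


def sample_free_flight(total_rate, rng):
    """Draw a free-flight duration for a constant total rate (self-scattering
    scheme): t = -ln(r) / Gamma, with r uniform in (0, 1]."""
    return -math.log(1.0 - rng.random()) / total_rate


def choose_mechanism(rates, total_rate, rng):
    """Select the process ending a free flight.

    rates maps mechanism name -> rate (1/s) at the carrier's current energy.
    Whatever part of total_rate is not covered by real mechanisms counts as
    self-scattering, which leaves the carrier state unchanged.
    """
    threshold = rng.random() * total_rate
    cumulative = 0.0
    for name, rate in rates.items():
        cumulative += rate
        if threshold < cumulative:
            return name
    return "self"


rng = random.Random(0)
gamma = 1.0e13  # hypothetical constant total rate, 1/s
rates = {"polar_optical_phonon": 5.0e12, "carrier_carrier": 2.0e12}  # hypothetical

flights = [sample_free_flight(gamma, rng) for _ in range(20000)]
mean_flight = sum(flights) / len(flights)
events = [choose_mechanism(rates, gamma, rng) for _ in range(20000)]
phonon_fraction = events.count("polar_optical_phonon") / len(events)

print(f"mean free flight: {mean_flight * 1e15:.1f} fs")
print(f"phonon fraction:  {phonon_fraction:.3f}")
```

In a self-consistent device simulation this step would be interleaved with a field update on the spatial mesh at every time step; that part is omitted here.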

The  relaxation  of  carriers  excited  by  ultra-short  optical  pulses  has  been 
intensely  studied  both  experimentally  and  theoretically  for  several  years. 
Despite  this  a  full  understanding  of  the  complex  carrier  dynamics  in  these 
highly  non-equilibrium  situations  is  still  lacking.  Current  Monte  Carlo 
models  of  carrier  relaxation  have  achieved  considerable  success  in  explaining 
many  qualitative  features  of  experimental  observations  [10,  11].  However,  a 
great  deal  of  uncertainty  remains.  At  the  heart  of  these  uncertainties  lies  the 
role  of  carrier-carrier  scattering.  Models  to  this  point  have  almost 
unanimously  assumed  that  free  carrier  screening  can  be  adequately  described 
by  using  long  wavelength  static  approximations.  However,  recent  results 
have  indicated  carrier-carrier  scattering  may  be  seriously  underestimated  in  a 
static  screening  limit  [12,  13].  Theoretical  calculations  also  have  shown  that 
carrier-carrier  scattering  rates  are  significantly  enhanced,  if  dynamic  screening 
effects  are  taken  into  account  [14, 15].  Since  the  two  most  important  scattering 
processes  in  compound  semiconductors,  polar  optic  phonon  scattering  and 
carrier-carrier  scattering,  are  both  heavily  dependent  on  free  carrier  screening, 
this issue is critical to understanding the role of these scattering
mechanisms. Significant progress must be made in this area before the details
of femtosecond optical experiments can be understood. Progress in this area
has  recently  been  made  through  a  joint  ensemble  Monte  Carlo/molecular 
dynamics approach [16] which has had some success in correlating with measured
data. However, this method appears limited to homogeneous systems due to
limitations arising from the size of the area that can be simulated. Thus it is
unlikely that this method can be applied more widely to the modeling of
other highly non-equilibrium phenomena, such as are found in state-of-the-art
high speed compound semiconductor devices.

PROGRESS 

Non-Equilibrium  Carrier  Transport 

Work  in  this  subtask  has  built  on  our  past  efforts  with  progress  made  in  the 
following  areas:  dual  carrier  transport  formulation  and  software 
development,  and  understanding  of  carrier  launching  across  electron 
launchers  subject  to  interactions  with  two-dimensional  space  charges. 


The bipolar part of our previous software package, OPTMC, developed by J.E.
Bair, has been incorporated into the existing 2DMC transport code. The new
code, M2EDUSA, for Multi-dimensional Monte carlo Ensemble Simulation
for Detailed Unipolar and bipolar heterojunction Semiconductor device
Analysis, implements two k.p conduction band valleys, Γ and L, and two
hole bands at Γ, including warping. All the usual scattering mechanisms are
included.  Any  number  of  ternary  compound  layers,  with  either  donor  or 
acceptor doping, can be specified along one spatial dimension. Material
homogeneity is assumed in the lateral direction. Either rectangular or mesa
type  devices  can  be  simulated,  and  ohmic  or  Schottky  contacts  of  any  length 
can  be  placed  anywhere  around  the  two-dimensional  periphery.  Quantum 
mechanical  reflection  due  to  the  potential  step  at  the  abrupt  heterojunctions 
is  included  to  first  order.  The  entire  implementation  was  constructed  with 
the  goal  of  having  the  simulations  run  as  accurately  as  possible  in  a 
workstation  computing  environment  without  the  use  of  supercomputing 
resources. This required, for example, the use of analytic energy bands in order to
decrease  run  time  memory  requirements  and  the  exclusion  of  carrier 
scattering  in  order  to  keep  execution  times  acceptable.  A  typical  M2EDUSA 
run  for  a  vertical  FET  with  an  abrupt  heterojunction,  zero  applied  bias,  charge 
neutral  initial  state,  256x64  point  spatial  mesh,  50000  particles,  and  1000  time 
steps takes about 8:45, 2:54, and 2:26 hrs:min on an HP 9000/380, a DEC
5000/200, and an HP 9000/720 engineering workstation respectively. As a
calibration, the HP-RISC station 9000/720 is 37.4 times faster than the DEC
VAX 11/780, an early 1980's CISC computer, which has often been used as a
benchmark.
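
The run times quoted above can be compared directly. The short sketch below converts the hrs:min figures to minutes, computes speedups relative to the slowest machine, and estimates the cost per particle per time step on the fastest one. The helper name is hypothetical, and the VAX 11/780 estimate assumes that the quoted 37.4x benchmark ratio carries over unchanged to this workload, which the report does not claim.

```python
def to_minutes(hhmm):
    """Convert an 'h:mm' string to whole minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


# Run times quoted in the report (50000 particles, 1000 time steps).
run_times = {"HP 9000/380": "8:45", "DEC 5000/200": "2:54", "HP 9000/720": "2:26"}

baseline = to_minutes(run_times["HP 9000/380"])
speedups = {name: baseline / to_minutes(t) for name, t in run_times.items()}

# Cost per particle per time step on the fastest workstation, in microseconds.
fastest_s = to_minutes(run_times["HP 9000/720"]) * 60
per_particle_step_us = fastest_s / (50000 * 1000) * 1e6

# Rough VAX 11/780 estimate, assuming the 37.4x ratio applies to this workload.
vax_hours = to_minutes(run_times["HP 9000/720"]) * 37.4 / 60

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x relative to HP 9000/380")
print(f"per particle-step: {per_particle_step_us:.0f} us")
print(f"estimated VAX 11/780 run: {vax_hours:.0f} hours")
```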

M2EDUSA  has  been  calibrated  and  verified  against  physics  principles,  other 
simulators, and experiments. The simulated velocity-field curve for GaAs
matches measured data within the experimental error. M2EDUSA
accurately  resolves  the  transient  phenomenon  of  the  formation  and 
propagation  of  a  stable  Gunn  domain.  M2EDUSA  also  accurately  models  the 
dynamics  of  carrier  heating  and  cooling,  the  time-dependence  of  the  electron 
phonon  interaction  and  the  average  momentum  relaxation  time.  The 
correlation  of  M2EDUSA  simulated  current-voltage  characteristics  with 
measured  device  data  for  a  vertical  FET  with  an  imbedded  heterojunction 
shows excellent overall agreement. M2EDUSA was also used to simulate
significant  characteristics  of  the  homojunction  pn  diode  and  the 
homojunction  bipolar  junction  transistor  with  a  direct  comparison  with  the 
drift-and-diffusion  simulator  PISCES-II  with  excellent  results.  Further 
verification  and  calibration  studies  are  documented  in  detail  in  S.  Weinzierl's 
PhD  thesis  (see  degrees  awarded). 

To  date  M2EDUSA  has  been  used  to  simulate  the  following  unipolar  and 
bipolar compound semiconductor devices: the vertical field effect
transistor with an abrupt heterojunction launcher (HJ-VFET); the modified
planar doped barrier vertical FET (PDB-VFET); and the ballistic
heterojunction bipolar transistor (BHBT). As an example, Figure 1 shows the
simulated  average  electron  drift  velocity  in  an  npn  AlGaAs/GaAs  BHBT 
device for three different base doping levels. The unipolar devices have been
analyzed completely, resulting in a full understanding of device operation and
the role of imbedded ballistic electron injectors. The bottom line is that
ballistic electron injectors do indeed enhance device performance, as measured,
e.g., by the cut-off frequency and current drive capability, but only if the
device has been designed correctly. The design is difficult because many
tradeoffs have to be considered simultaneously, a task that is nearly impossible
without the microscopic insight and optimization provided by M2EDUSA.
Although only preliminary results are available for the BHBT at this time,
the same conclusion seems to apply.


Figure 1. Simulated steady state average electron drift velocity for an npn
AlGaAs/GaAs BHBT device as a function of position at 300K. Three base
doping levels are given. The applied voltages are Vce = +10V and Vbe =
+0.1V, not including the built-in junction potentials.


Femtosecond  Optical  Interactions 

Femtosecond  thermalization  of  optically  excited  carriers  in  thin  films  has 
been  explored  using  a  self-consistent  Monte  Carlo  technique  fully  modeling 
the interaction of conduction band electrons and light and heavy holes
with  femtosecond  optical  pulses.  The  electron  and  hole  bands  are  described 
using  a  k.p  formulation,  with  corrections  for  higher  bands  included  through 
second  order  perturbation  theory.  The  interaction  of  optical  pulses  with  the 
evolving  carrier  distribution  function  is  handled  self-consistently  through 
Fermi's  golden  rule.  All  important  scattering  mechanisms  are  included.  Polar 
optic  phonon  scattering  and  carrier-carrier  scattering  are  screened  using  a 
static  screening  model  calculated  self-consistently  from  the  distribution 
function.  Carrier  transport  along  the  normal  to  the  thin  film  is  included 
through  a  self-consistently  calculated  electric  field. 

The  dependence  of  ultrafast  carrier  relaxation  on  each  of  the  carrier  scattering 
processes  and  on  several  experimental  parameters  has  been  investigated. 
The  latter  included  the  frequency  and  intensity  of  the  excitation  pulse,  as  well 
as  the  temporal  width  of  the  excitation  and  probe  pulses,  and  the  film 
thickness.  Optical  phonon  scattering  was  found  to  dominate  the  relaxation 
processes,  with  carrier-carrier  scattering  of  secondary  importance  and  other 
scattering  processes  playing  only  a  minor  role.  For  carrier-carrier  scattering 
this  was  contrary  to  expectations,  since  carrier-carrier  scattering  is  thought  to 
be  a  dominant  process  at  the  carrier  densities  in  question.  In  this  model 
carrier-carrier  scattering  is  largely  suppressed  by  free  carrier  screening  due  to 
the  large  carrier  densities  excited  by  the  pulse. 
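
The suppression of Coulomb scattering by static screening at high density can be illustrated with the simplest long-wavelength static model. The sketch below uses the nondegenerate Debye form of the screening length, which is an assumption made for illustration: the model described above computes screening self-consistently from the evolving distribution function, and degeneracy matters at these densities. The relative permittivity of 12.9 (a commonly quoted static value for GaAs), the 300 K temperature, and the chosen wavevector are likewise illustrative assumptions.

```python
import math

E_CHARGE = 1.602176634e-19   # C
EPS0 = 8.8541878128e-12      # F/m
K_B = 1.380649e-23           # J/K


def debye_length_m(density_cm3, temperature_k=300.0, eps_r=12.9):
    """Static, nondegenerate Debye screening length for a carrier density
    given in cm^-3 (eps_r and temperature are illustrative assumptions)."""
    density_m3 = density_cm3 * 1.0e6
    return math.sqrt(EPS0 * eps_r * K_B * temperature_k
                     / (density_m3 * E_CHARGE ** 2))


def screening_factor(q_per_m, density_cm3, temperature_k=300.0, eps_r=12.9):
    """Ratio of screened to bare Coulomb matrix element at wavevector q:
    q^2 / (q^2 + q_D^2), with q_D the inverse Debye length."""
    q_d = 1.0 / debye_length_m(density_cm3, temperature_k, eps_r)
    return q_per_m ** 2 / (q_per_m ** 2 + q_d ** 2)


q = 2.0e8  # illustrative momentum transfer, 1/m
for n in (1e16, 1e17, 1e18):
    print(f"n = {n:.0e} cm^-3: Debye length {debye_length_m(n) * 1e9:.1f} nm, "
          f"screening factor {screening_factor(q, n):.3f}")
```

In this static picture the matrix element at fixed momentum transfer falls steadily as the density rises, which is the qualitative trend behind the suppressed carrier-carrier scattering reported for the static model.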

Transmission data from pump-probe experiments were found to depend on
several  experimental  parameters.  The  energy  of  the  exciting  photons  was 
found  to  be  particularly  important,  since  the  simulated  relaxation  times 
significantly  decrease  with  increasing  photon  energy.  This  is  largely  due  to  a 
step  at  the  first  electron-phonon  threshold,  and  correlates  closely  with  an 
increase  in  electron-optical  phonon  scattering  rate  at  that  photon  energy. 
Conversely,  simulated  relaxation  times  were  found  to  increase  with  the 
intensity  of  the  excitation  pulse.  This  is  due  to  the  suppression  of  both  the 
optical  phonon  and  carrier-carrier  scattering  rates  resulting  from  increased 
free  carrier  screening  and  degeneracy  as  the  carrier  density  increases.  A 
complex  relationship  between  the  widths  of  the  excitation  and  probe  pulses 
and  overall  relaxation  times  was  observed.  Also  a  significant  decrease  in  the 
fitted  relaxation  times  with  increasing  sample  thickness  was  found  for 
samples with thickness equal to or greater than the optical absorption length.


Due  to  the  critical  role  screening  plays  in  femtosecond  experiments,  it  was 
concluded  that  the  consequences  of  the  approximations  made  in  the  static 
screening model needed to be examined in greater detail. To this end a new,
more accurate dynamic screening model has been developed and
implemented in our Monte Carlo code. The new model is derived from the
Lindhard  dielectric  function  and  fully  incorporates  the  energy  dependence  of 
the free carrier screening. The use of an approximate parabolic band structure
and the neglect of anisotropy in the bands, carrier distributions, and dielectric
function are the only simplifications made. Improvements are being
considered  to  incorporate  a  more  accurate  band  structure  with  anisotropy  into 
the  model. 

The  new  dynamic  screening  model  resulted  in  a  dramatic  change  in  the  effect 
of  free  carrier  screening.  Polar  optical  phonon  scattering  appears  largely 
unscreened  and  carrier-carrier  scattering  rates  are  substantially  enhanced. 
Also,  several  unexpected  features  of  the  highly  non-equilibrium  dielectric 
function  result  in  a  spectacular  enhancement  in  the  carrier-carrier  scattering 
of  electrons  and  light  holes  at  early  times.  The  result  is  a  model  of  carrier 
relaxation,  for  which  relaxation  times  are  substantially  shorter  than  those 
obtained  using  static  screening,  and  where  carrier-carrier  scattering  plays  a 
much  enhanced  and  perhaps  dominant  role. 

Investigations  are  currently  underway  to  determine  the  effects  of  photon 
energy,  excitation  pulse  intensity  and  the  role  of  each  scattering  mechanism 
using the dynamic screening model. These results will then be compared to
those  obtained  using  static  screening.  Preliminary  results  for  the  effect  of 
photon  energy  dependence  indicate  not  only  substantially  reduced  relaxation 
times  for  dynamic  screening,  but  a  qualitatively  different  relationship 
between photon energy and relaxation time in the two cases. This difference
makes it possible to compare both models with experiment and to reject one
in favor of the other. Further investigations on doped samples are being
planned. 

SCIENTIFIC  IMPACT  OF  RESEARCH 

Non-equilibrium  carrier  transport  and  optical  interactions  in  high  speed 
electronic  and  optoelectronic  device  structures  are  examined  in  this  work 
unit  using  time-dependent  self-consistent  ensemble  Monte  Carlo  particle 
simulation  techniques.  We  have  now  completed  the  simulation  software 
development  for  two-dimensional  unipolar  and  bipolar  heterojunction 
devices  with  graded  structures  and  built-in  heterojunctions.  Microscopic 
aspects  of  high  speed  electron  transport  phenomena  in  several  high  speed 
devices  have  been  investigated.  A  detailed  understanding  of  unipolar  and 
bipolar  non-equilibrium  carrier  transport,  including  steady  state  and  transient 
ballistic  carrier  launching,  subject  to  self-consistent  space  charges  has  now 
been established. We are able, for example, to explain why fabricated devices
in  the  past  have  not  reached  expected  high  cut-off  frequencies  and  to  design 
optimum  devices  (layer  sequence,  materials  composition  and  geometrical 
dimensions)  for  highest  frequency  large  signal  operation.  From  this  work  it  is 
very  clear  that  the  full  exploitation  of  non-equilibrium  carrier  transport  in 
high  speed  compound  semiconductor  devices  requires  microscopic  insight 
and optimization that can only be provided by sophisticated particle codes
such as MEDUSA.

Free  carrier  screening  has  been  shown  to  be  a  critical  mechanism  in 
developing  accurate  models  of  carrier  relaxation  and  to  this  end  a  new 
dynamic  carrier  screening  model  has  been  developed.  It  much  more 
accurately  models  the  free  carrier  dielectric  functions  than  previous  static 
models.  Static  screening  appears  inadequate  in  modeling  femtosecond  carrier 
relaxation.  The  inclusion  of  dynamic  screening  provides  a  much  more 
accurate  understanding  of  the  microscopic  processes  involved.  Further,  the 
increased  importance  of  carrier-carrier  scattering,  when  combined  with 
dynamic screening, seems to require that the effect of carrier-carrier scattering
be  reassessed  in  other  situations  where  highly  non-equilibrium  distribution 
functions  are  involved.  Such  conditions  are  found  in  state  of  the  art  high 
speed  heterojunction  devices. 
OBJECTIVE

This proposal is concerned with parallel adaptive computational schemes for
real-time processing of data collected from an environment that changes
rapidly in time. A basic step in adaptive processing is to discard a portion of
the  "old"  data  which  no  longer  represents  the  environment,  add  new  data, 
and  then  "adapt"  the  current  knowledge  about  the  environment  according  to 
the  change  in  the  data.  Such  processing  arises  for  example  in  sensor  array 
processing.  Our  three  major  objectives  are:  (i)  development  of  strategies  for 
adding and deleting information from the covariance matrix in
multi-direction beamforming (least squares), (ii) development of strategies for
tracking  the  eigenstructure  of  the  array  data  after  addition  and  deletion  of  data 
(covariance  differencing),  (iii)  evaluation  of  procedures  in  (i)  and  (ii)  on 
emerging  parallel  processor  architectures. 

DISCUSSION OF STATE-OF-THE-ART

Least  squares  problems  are  ubiquitous  in  engineering,  science,  operations 
research,  etc.  The  linear  least  squares  problem  can  be  posed  as  a  problem  of 
finding the vector x which minimizes the quadratic form $(Ax - b)^{\dagger}(Ax - b)$
where  A  and  b  are  given  data. 

In  applications  various  constraints  are  imposed  on  the  weight  vector  x. 
Typical  constraints  are  linear  equality  constraints,  linear  inequality 
constraints,  or  quadratic  constraints. 

The  method  of  choice  for  solving  full  rank  least  squares  equations  is  to 
proceed  by  a  unitary  transformation  Q  that  "compresses"  the  data  matrix  A  to 
the  "information  equivalent"  triangular  matrix  U.  This  triangular  matrix  is 
known as a Cholesky factor of $A^{T}A$. The desired least squares solution is next
determined  by  solving  the  corresponding  triangular  system  of  linear 
equations. 
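
The procedure just described can be sketched as follows. This is a minimal illustration, assuming real-valued data and NumPy; the function name `lstsq_via_qr` and the random test data are hypothetical and do not come from the report.

```python
import numpy as np

def lstsq_via_qr(A, b):
    """Solve min ||Ax - b||_2 for a full-column-rank A via a QR factorization."""
    Q, R = np.linalg.qr(A)        # reduced QR: A = Q R with R upper triangular
    c = Q.conj().T @ b            # right-hand side in the compressed coordinates
    n = R.shape[0]
    x = np.zeros(n, dtype=np.result_type(R, c))
    # Back substitution on the triangular system R x = c.
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x, R

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))   # illustrative data matrix
b = rng.standard_normal(8)
x, R = lstsq_via_qr(A, b)
```

The triangular factor returned here satisfies `R.T @ R == A.T @ A` up to rounding, which is the sense in which it is a Cholesky factor of the normal-equations matrix (possibly with negative diagonal entries, depending on the QR sign convention).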

In recursive least squares the minimization problem needs to be
solved  repeatedly  after  some  rows  of  A  are  removed  and  additional  rows  are 
added.  This  happens  if,  for  example,  the  data  to  be  deleted  is  unrepresentative 
of the data at large and so its effects on the weight vector (or parameter
estimate) x must be excised (robust statistics). Or perhaps the data is changing
with  time  and  old  data  must  be  deleted  (adaptive  beamforming)  [8,  11].  The 
addition  and  the  deletion  are  known  as  updating  and  downdating  the 
Cholesky  factor,  respectively,  or  simply  as  a  modification  of  the  Cholesky 
factor. 

The  combined  process  of  updating  and  downdating  the  Cholesky  factor  is 
called  a  sliding  rectangular  window  process,  and  is  one  of  the  topics  of  the 
proposed  research. 
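
One step of such a sliding window can be sketched with the textbook rank-one modification of a Cholesky factor: plane rotations for updating and plain hyperbolic rotations for downdating. This is an illustrative sketch for real data only; it is not the stabilized hyperbolic Householder scheme of [5], and the function name `chol_rank_one`, the window length, and the test data are assumptions.

```python
import numpy as np

def chol_rank_one(R, x, sign=+1):
    """Return upper-triangular R1 with R1.T @ R1 == R.T @ R + sign * outer(x, x).

    sign=+1 adds a data row (updating); sign=-1 removes one (downdating).
    Assumes R has a positive diagonal.
    """
    R = np.array(R, dtype=float)   # np.array copies, so the inputs are untouched
    x = np.array(x, dtype=float)
    for k in range(x.size):
        d2 = R[k, k] ** 2 + sign * x[k] ** 2
        if d2 <= 0.0:
            raise ValueError("downdate would make the matrix indefinite")
        r = np.sqrt(d2)
        c = r / R[k, k]
        s = x[k] / R[k, k]
        R[k, k] = r
        R[k, k + 1:] = (R[k, k + 1:] + sign * s * x[k + 1:]) / c
        x[k + 1:] = c * x[k + 1:] - s * R[k, k + 1:]
    return R

rng = np.random.default_rng(1)
data = rng.standard_normal((10, 3))   # illustrative stream of data rows
window = 6
R = np.linalg.cholesky(data[:window].T @ data[:window]).T
for t in range(window, data.shape[0]):
    R = chol_rank_one(R, data[t], +1)            # add the newest row
    R = chol_rank_one(R, data[t - window], -1)   # delete the oldest row
```

After the loop, `R` is the factor for the last six rows only. The `ValueError` branch marks the point where a downdate fails; in floating point that can happen for a nearly singular window even when the exact result is positive definite.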

Processing  of  recursive  least  squares  problems  on  sequential  machines  is  now 
well  understood.  It  is  known  that  while  the  updating  process  is  numerically 
sound, the downdating can be very sensitive to rounding errors [3]. Thus, if
the problem is expected to be ill-conditioned, downdating requires forming
Q and downdating Q itself [7]. This, however, results in quite high
computational cost and additional memory requirements for storing Q. If
ill-conditioning is not expected (this can be checked concurrently with
processing the data), the downdating of the Cholesky factor can be realized
via the so-called Stabilized Hyperbolic Householder scheme [5] (see also
[14]), which is the least expensive in terms of number of operations (for
multiple vector updating/downdating problems) and hence the preferable
method for the sequential rectangular sliding window process. When the
downdating of the covariance matrix results in a sign-indefinite matrix, the
method of hyperbolic singular value decomposition recently proposed in [12]
can be used to deal with this sign indefiniteness.

In  parallel  computing  the  additional  cost  of  interprocessor  communication 
has  to  be  taken  into  account  in  assessing  the  cost  of  executing  algorithms. 
Most  discussions  surrounding  multiprocessor  computers  for  signal 
processing  have  centered  on  planar  (triangular)  arrays  [4,  6,  9,  10,  16,  15]. 
Perhaps  the  sole  exception  has  been  the  important  contribution  by  Rader  in 
[13].  Both  triangular  and  linear  arrays  considered  in  [10]  or  [13]  are  designed  to 
implement  efficiently  the  exponential  weighting  method.  The  exponential 
weighting  method  is  very  attractive  for  parallel  implementation  as  it  can  be 
realized  by  a  single  updating  process.  On  the  other  hand,  the  sliding  window 
process  is  a  composite  task  in  the  sense  that  each  recursive  step  involves 
updating  and  downdating  of  the  triangular  factor  followed  by  solving  the 
resulting  triangular  systems  of  linear  equations.  None  of  the  architectures 
proposed  in  [9]  or  [13]  can  efficiently  deal  with  the  sliding  window  process 
described above. A preliminary study on the behavior of a simple variant of
the sliding window process on a linear array of processors has recently been
reported in [2].
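
The claim that exponential weighting needs only a single updating step per new data row can be illustrated as follows. This is a sketch assuming real data; the forgetting factor `lam`, the function name, and the test values are hypothetical, and the triangularization is done with a dense QR rather than with the array algorithms of the cited references.

```python
import numpy as np

def exp_weighted_update(R, x, lam=0.95):
    """Return R1 with R1.T @ R1 == lam * R.T @ R + outer(x, x)."""
    # Scale the old factor by sqrt(lam), append the new row, re-triangularize.
    stacked = np.vstack([np.sqrt(lam) * R, x[np.newaxis, :]])
    return np.linalg.qr(stacked, mode="r")

rng = np.random.default_rng(3)
R0 = np.triu(rng.standard_normal((3, 3))) + 3.0 * np.eye(3)
x_new = rng.standard_normal(3)
R1 = exp_weighted_update(R0, x_new, lam=0.9)
```

Old data is never removed explicitly here; its influence simply decays by the factor `lam` at every step. The sliding window, by contrast, needs an update followed by a downdate.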

In  this  task  we  will  extend  the  problem  of  implementing  the  sliding 
rectangular  window  process  investigated  in  [2]  to  other  parallel  architectures 
and  more  general  least  squares  problems. 


One  of  the  tasks  in  sensor  array  processing  is  to  compute  the  noise  subspace  of 
the  data  matrix  derived  from  the  array  of  sensors.  In  this  case  the  data  matrix 
can  be  considered  as  having  a  low  (numerical)  rank  and  then  the  standard 
recursive updating/downdating approach may lead to unsatisfactory results.
One way to circumvent this problem is by computing the singular value
decomposition (SVD), or the recently proposed URV decomposition, of A. The
advantage of the URV over the SVD is that a complete decomposition can be
updated at a much lower cost than the corresponding SVD.
This  is  particularly  attractive  for  adaptive  processing  and  can  be  used  in  the 
recursive  least  squares  problems  [1, 17]. 
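
As a point of reference, the noise subspace of a numerically rank-deficient data matrix can be read off a full SVD. The sketch below shows only this SVD baseline, recomputed from scratch; it does not implement URV updating. The relative threshold `tol`, the function name, and the synthetic rank-2 data are assumptions.

```python
import numpy as np

def noise_subspace(A, tol=1e-8):
    """Return (N, rank): an orthonormal basis N of the numerical null space of A."""
    U, s, Vh = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(s > tol * s[0]))   # numerical rank relative to the largest singular value
    return Vh[rank:].conj().T, rank      # trailing right singular vectors span the noise subspace

rng = np.random.default_rng(4)
A = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 5))   # rank-2 data, 5 sensors
N, rank = noise_subspace(A)
```

Restarting this computation for every new data row is the alternative against which updating a URV-type decomposition would be compared.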

We plan to address the question of under what conditions matrix updating
techniques  for  the  URV  type  decompositions  are  preferable  to  restarting  the 
singular  value  decomposition  of  the  matrix  when  implemented  on  parallel 
architectures. 

PROGRESS 

New  task  started  September  30, 1991. 

SCIENTIFIC IMPACT OF RESEARCH

The research in this task will aid in developing real-time sensor array
systems. The contributions will be twofold. Firstly, new highly concurrent
algorithms  amenable  to  efficient  parallel  implementation  will  be  proposed 
and  their  numerical  properties  will  be  analyzed.  Secondly,  a  parametric 
model will be developed so that, given the user's specified requirements, solid
recommendations  can  be  made  as  to  the  applicability  of  the  techniques 
discussed  in  this  proposal.  The  accuracy  of  the  model  will  be  thoroughly 
tested  on  existing  parallel  architectures.  Part  of  the  research  will  also  address 
the  question  of  fault  tolerance,  and  multiprocessor  organization  for  real  time 
signal  processing  systems. 

DEGREES  AWARDED 

New  task. 
OBJECTIVE

Multidimensional  signal  processing  in  the  context  of  processing  signals 
received  by  an  array  of  sensors  has  many  important  applications.  The  type  of 
filtering  that  can  be  conveniently  applied  to  signals  carried  by  propagating 
waves  is  beamforming,  which  seeks  to  isolate  signal  components  that  are 
propagating  in  a  particular  direction.  Although  computationally  expensive, 
the  beamforming  procedure  has  been  rapidly  rising  in  popularity  due  to 
advances in both matrix algorithms and systolic arrays. Most systolic arrays
will be deployed in harsh environments and thus be susceptible to frequent
transient  errors.  The  principal  objective  of  this  task  is  to  develop  systolic  fault 
tolerant  beamforming  techniques.  Special  attention  will  be  paid  to 
computing  complex  matrix  decompositions,  avoiding  numerical  overflows, 
differentiating  between  errors  arising  from  numerical  roundoff  buildups  and 
those  from  hardware  failures,  and  interrupting  the  operation  of  systolic  arrays 
for  error  correction. 

DISCUSSION  OF  STATE-OF-THE-ART 

The  compatibility  of  systolic  arrays  and  algorithms  with  both  matrix 
computations  and  today's  VLSI  and  wafer-scale  technology  guarantees  their 
future  use  as  key  components  in  any  signal  processing  system.  An  especially 
important  systolic  algorithm  is  the  orthogonal  triangulation  algorithm  (the 
QR decomposition) for least squares minimization, a crucial step in most
adaptive antenna processing algorithms. The importance of these problems
is evidenced by two major systolic array projects: one at MIT's Lincoln
Laboratory  [1]  and  the  other  at  the  United  Kingdom's  Royal  Signals  and  Radar 
Establishment  (RSRE)  [2].  However,  traditional  fault  tolerance  techniques 
such  as  modular  redundancy  have  been  regarded  as  too  costly  and  unwieldy 
to  implement  on  these  systolic  arrays.  In  [3],  a  JSEP  supported  work,  we 
presented  a  simple  fault  tolerance  scheme  for  the  QR  decomposition  and 
showed  how  it  can  be  easily  incorporated  into  the  RSRE  systolic  arrays  for 
recursive  least  squares  minimization.  Our  work,  scarcely  one  year  old,  has 
already  won  recognition  at  the  RSRE  as  a  possible  fault  tolerance  technique 
for  their  systolic  arrays  [4]. 


Data  matrices  that  are  ill-conditioned  call  for  a  more  robust  and  more 
expensive  numerical  technique  known  as  the  singular  value  decomposition 
(SVD).  An  SVD  systolic  array  designed  by  us  has  been  adopted  for  hardware 
implementation  at  both  the  RSRE  [5]  and  Computational  Engineering,  Inc.  [6]. 
The  implementation  of  the  latter  will  be  used  for  real  time  system  control;  its 
application  in  the  wing  flutter  analysis  of  supersonic  planes  has  been  proven 
in  a  wind  tunnel  test  at  an  Air  Force  Laboratory  in  Ohio.  The  problem  of 
fault  tolerant  computation  of  the  singular  value  decomposition  awaits  a  nice 
solution.  Schemes  were  reported  in  [7],  but  they  are  so  complicated  that  triple 
modular  redundancy  may  well  be  a  better  choice. 

Existing  fault  tolerance  schemes  have  often  been  ignored  by  systolic  array 
designers  because  they  are  too  costly  and  unwieldy  to  implement.  An 
attractive  new  idea  came  in  the  form  of  algorithm-based  fault  tolerance.  This 
approach employs three steps: encode the input data, execute the algorithm
on  the  encoded  input  to  produce  encoded  output,  and  decode  the  output  to 
detect  and  perhaps  correct  errors.  Both  checksum  and  weighted  checksum 
encoding  schemes  have  been  developed  by  Abraham  et  al.  [8,  9],  who  showed 
that a variety of matrix operations preserve the checksum property.
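
The encode, execute, decode cycle can be illustrated with the plain (unweighted) checksum scheme applied to matrix multiplication. This is a minimal sketch, assuming a single corrupted data element and small integer-valued test matrices; the function names and the injected fault are hypothetical.

```python
import numpy as np

def encode_column_checksum(A):
    """Append a row holding the column sums of A."""
    return np.vstack([A, A.sum(axis=0, keepdims=True)])

def encode_row_checksum(B):
    """Append a column holding the row sums of B."""
    return np.hstack([B, B.sum(axis=1, keepdims=True)])

def check(Cf, tol=1e-9):
    """Return indices of data rows and columns whose checksums disagree."""
    data = Cf[:-1, :-1]
    row_err = np.abs(data.sum(axis=1) - Cf[:-1, -1])
    col_err = np.abs(data.sum(axis=0) - Cf[-1, :-1])
    return np.flatnonzero(row_err > tol), np.flatnonzero(col_err > tol)

rng = np.random.default_rng(2)
A = rng.integers(-5, 6, size=(3, 4)).astype(float)
B = rng.integers(-5, 6, size=(4, 3)).astype(float)

# Encode the inputs, then run the ordinary product on the encoded operands.
Cf = encode_column_checksum(A) @ encode_row_checksum(B)
clean_rows, clean_cols = check(Cf)

# Inject a single fault into the data part, then decode: detect, locate, repair.
Cf_bad = Cf.copy()
Cf_bad[1, 2] += 5.0
rows, cols = check(Cf_bad)
i, j = rows[0], cols[0]
Cf_bad[i, j] = Cf_bad[i, -1] - (Cf_bad[i, :-1].sum() - Cf_bad[i, j])
```

The product of the encoded operands is itself a checksum matrix, so the multiplication preserves the checksum property. The intersection of the failing row and the failing column locates a single error.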

In  [9,  10]  a  linear  algebraic  interpretation  of  the  weighted  checksum  scheme 
was  proposed.  Such  a  model  allows  parallels  to  be  drawn  between  algorithm- 
based  fault  tolerance  and  coding  theory,  and  makes  it  possible  to  examine  in 
detail  the  difficulties  in  choosing  weight  vectors  such  that  the  correction 
vector  can  be  explicitly  resolved.  The  hard  problem  of  how  to  determine  the 
exact  number  of  errors  that  have  occurred  has  been  solved  in  [11].  For  error 
correction,  prior  to  [11],  it  was  known  only  how  to  correct  a  weighted 
checksum  scheme  for  the  cases  of  one  error  [9]  and  two  errors  [10].  In  [11]  a 
theoretical  framework  was  given  which  would  enable  one  to  solve  the 
correction  problem  for  the  general  case. 

The  weighted  checksum  technique  has  been  demonstrated  to  be  effective  in 
multiple  error  detection.  It  has  been  shown  that,  in  order  to  guarantee  error 
detection,  the  chosen  weight  vectors  must  satisfy  some  very  specific 
properties  about  linear  independence.  Previously,  appropriate  sets  of  weight 
vectors  have  been  proposed  which  are  powers  of  integers  [9,  12];  these  suffer 
from  the  fact  that  the  weights  can  become  very  large.  In  [13, 14]  a  new  scheme 
was  presented  that  generates  weight  vectors  to  meet  the  requirements  about 
independence  and  to  avoid  the  difficulties  with  overflow. 
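
The following sketch illustrates single-error correction with a weighted checksum and shows why weights chosen as powers of an integer grow quickly. It uses exact Python integers, weights 2^j, and one plain plus one weighted checksum; these choices, the function names, and the test vector are assumptions for illustration and are not the schemes of [13, 14].

```python
def power_weights(n, base=2):
    """Weights base**0, base**1, ..., base**(n-1)."""
    return [base ** j for j in range(n)]

def checksums(x, w):
    """Return the plain checksum and the weighted checksum of x."""
    return sum(x), sum(wj * xj for wj, xj in zip(w, x))

def correct_single_error(x, s1, s2, w):
    """Locate and repair at most one corrupted entry of x using stored checksums."""
    t1, t2 = checksums(x, w)
    d1, d2 = t1 - s1, t2 - s2          # an error e at position p gives d1 = e, d2 = w[p] * e
    if d1 == 0 and d2 == 0:
        return list(x), None           # checksums agree: no error detected
    if d1 == 0 or d2 % d1 != 0 or d2 // d1 not in w:
        raise ValueError("not a single data error")
    p = w.index(d2 // d1)              # the ratio d2 / d1 identifies the position
    fixed = list(x)
    fixed[p] -= d1
    return fixed, p

x = [3, -1, 4, 1, 5]
w = power_weights(len(x))
s1, s2 = checksums(x, w)

bad = list(x)
bad[3] += 7                            # inject one error
fixed, p = correct_single_error(bad, s1, s2, w)

largest = power_weights(64)[-1]        # weight needed for a 64-element vector
```

For a vector of 64 elements the largest weight is already 2^63, which no longer fits in a signed 64-bit integer. This is the overflow difficulty that motivates schemes producing small weights.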

PROGRESS 

In [15] we introduced a new algorithm-based fault tolerance technique
specifically  designed  for  use  on  j.  sive  antenna  arrays.  This  work  is 
significant  in  that  it  is  joint  with  an  industrial  researcher  with  access  to  real 
world  data  and  problems. 


Error  correction  has  proved  to  be  a  much  more  difficult  problem  to  solve 
than  error  detection  when  using  weighted  checksums.  In  [16]  we  provided  a 
theoretical  basis  for  the  correction  problem  and  showed  how  the  correction 
procedure  can  be  greatly  simplified  via  the  Lanczos  recursion. 

To  avoid  numerical  overflows,  in  [8]  and  [9],  two  methods  were  proposed  that 
use  modular  arithmetic  to  compute  weighted  checksums.  A  new  scheme  was 
derived  by  us  in  [13, 14].  A  real  breakthrough  was  achieved  by  us  in  [17]  where 
we  showed  how  small  weights  can  be  derived  via  the  use  of  orthogonal 
polynomials. 

SCIENTIFIC IMPACT OF RESEARCH

Our work is making a significant impact: it has attracted considerable
attention, and many researchers are attempting to improve on our work. We are
most  proud  of  our  result  in  discovering  the  relationship  between  the  famous 
Berlekamp-Massey  algorithm  for  decoding  the  Reed-Solomon  code,  and  the 
well  known  Lanczos  algorithm  in  numerical  computing  [16]. 

DEGREES  AWARDED 

None. 
OBJECTIVE

The  objective  of  our  task  is  to  address  two  important  issues  that  confront  the 
deployment  of  superscalar  processors,  processors  with  multiple  functional 
units,  which  issue  and  execute  multiple,  and  possibly  out-of-order, 
instructions  from  an  instruction  stream,  for  real-time  signal  processing  tasks. 
The  two  vexing  issues  are: 

1.  Interrupt  handling:  A  critical  requirement  for  real-time  signal 
processing  computation  is  that  the  computer  system  has  to  be  able  to  provide 
prompt  and  precise  interrupt  handling  capabilities.  Interrupt  requests  have  to 
be  promptly  handled  because  tasks  that  initiate  these  requests  have  to  be 
processed  as  soon  as  possible.  Responding  to  an  interrupt  request,  the 
processor  first  stores  its  processor  state;  this  has  to  be  done  precisely  so  that  the 
interrupted  process  can  be  resumed  at  the  point  of  interruption  later. 

The  presence  of  multiple  functional  units  enables  the  concurrent  execution  of 
multiple  instructions  from  the  same  instruction  stream.  Since  these 
instructions are at various stages of execution, it is a challenging task to
identify  and  then  store  a  precise  processor  state  quickly. 

2.  Branch  handling:  The  presence  of  conditional  branch  instructions 
invariably introduces disturbances into a dynamic instruction stream.
Moreover, branch instructions appear quite frequently in dynamic
instruction streams. It can be safely stated that these undesirable effects are
magnified  in  computer  systems  with  multiple  functional  units. 

The  objective  of  the  proposed  investigation  is  to  seek  solutions  to  these  two 
problems,  which  are  important  for  real-time  signal  processing  systems  as  well 
as  for  general  applications. 

In  addition,  we  have  extended  our  investigation  into  a  new  and  exciting  area: 
the  processing  of  multiple  instruction  streams  with  a  single  superscalar 
processor.  This  means  that  multiple  signal  processing  tasks  can  be  processed 
efficiently  and  concurrently  with  one  chip. 


DISCUSSION  OF  STATE-OF-THE-ART 


An  important  and  indispensable  feature  of  any  processor  is  its  ability  to 
handle  properly  interrupts  and  exceptions,  which  can  be  classified  into  three 
types:  external  interrupts,  exception  traps,  and  software  traps.  External 
interrupts  are  generated  from  or  by  the  environment  —  such  as  the  processing 
of  a  newly  arrived  task.  Abnormalities  encountered  in  system  processing, 
such  as  division  by  zero,  overflow,  or  illegal  operations,  generate  exception 
traps.  Software  traps  are  instructions  which  initiate  interrupt  requests;  these 
traps  provide  a  means  of  controlling  and  monitoring  program  executions. 

When  an  interrupt  request  is  received,  the  processor  must  save  its  processor 
state,  then  load  and  execute  an  appropriate  interrupt  handler.  Upon 
completion  of  the  interrupt  handling  routine,  the  saved  processor  state  is 
restored,  and  the  interrupted  process  can  then  be  restarted. 

A  processor  state  should  contain  enough,  and  preferably  only  enough, 
information  so  that  the  interrupted  process  can  be  restarted  at  the  precise 
point  where  it  was  interrupted.  To  be  able  to  resume  an  interrupted  process, 
the  processor  state  should  consist  of  the  contents  of  the  general  purpose 
registers,  the  program  counter,  the  condition  register,  all  index  registers  and 
the  relevant  portion  of  the  main  memory. 

The  classical  approach  to  identifying  precisely  the  point  where  a  process  is 
interrupted  is  to  save,  among  other  vital  items,  the  address  of  a  specific 
instruction,  say  instruction  a,  when  the  processor  state  is  saved.  All 
instructions that precede instruction a have been executed, whereas instruction a
and those that follow it have not. Instruction a thus provides a precise
interrupt  point. 

For  superscalar  processors,  which  execute  instructions  concurrently  and 
possibly  out-of-order,  the  identification  of  a  precise  interrupt  point  when  an 
interrupt  request  is  made  may  become  very  costly. 
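
One generic way to recover a precise interrupt point when instructions finish out of order is to let results reach the architectural state only in program order. The sketch below models that idea in the style of a reorder buffer. It is an illustration only: it is not the instruction window scheme of this task nor the mechanism of any machine cited here, and the class names, register names, and addresses are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Instr:
    pc: int              # instruction address
    dest: str            # destination register
    value: int           # result produced by execution
    done: bool = False   # finished executing, possibly out of order

class InOrderCommitBuffer:
    def __init__(self, program):
        self.window = list(program)   # issued but uncommitted, in program order
        self.registers = {}           # architectural (precise) register state

    def complete(self, pc):
        """Mark the instruction at address pc as finished executing."""
        for ins in self.window:
            if ins.pc == pc:
                ins.done = True
                return
        raise KeyError(pc)

    def commit(self):
        """Retire finished instructions strictly in program order."""
        while self.window and self.window[0].done:
            ins = self.window.pop(0)
            self.registers[ins.dest] = ins.value

    def interrupt(self):
        """Return (resume_pc, saved_state, aborted_pcs) for a precise interrupt."""
        self.commit()
        resume_pc = self.window[0].pc if self.window else None
        aborted = [ins.pc for ins in self.window if ins.done]
        for ins in self.window:
            ins.done = False          # finished but uncommitted work is discarded
        return resume_pc, dict(self.registers), aborted

buf = InOrderCommitBuffer([
    Instr(0, "r1", 10), Instr(4, "r2", 20), Instr(8, "r3", 30), Instr(12, "r4", 40),
])
for pc in (8, 0, 12):                 # out-of-order completion
    buf.complete(pc)
resume_pc, state, aborted = buf.interrupt()
```

In this run the instruction at address 4 becomes the precise interrupt point. The instructions at 8 and 12 had already finished but must be re-executed after the interrupt, which is the kind of cost that makes precise interrupts expensive on such processors.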

In order to evaluate interrupt handling schemes, a framework must be
established.  Three  factors  have  been  identified: 

1)  Latency: 

An  interrupt  handling  approach  must  be  judged  by  the  latency 
between  the  receipt  of  an  interrupt  request  and  the  completion  of 
saving  the  processor  state.  Clearly,  any  acceptable  interrupt  handling 
scheme should yield a latency that is appropriate for the interrupt
request,  which  may  be  generated  internally  or  externally. 

2)  Component  Cost: 
The cost of additional hardware and software incurred by the
installation  of  an  interrupt  handling  scheme  must  be  considered. 

3)  Performance  Degradation: 

The  presence  and  operation  of  an  interrupt  handling  scheme  may 
bring  about  performance  degradation;  its  extent  should  be  critically 
examined. 

There  are  three  sources  of  degradations: 

i.  Abort  —  In  response  to  an  interrupt  request,  some  instructions 
that  have  already  been  partially  or  even  completely  executed  are 
"aborted"; 

ii.  Execution  inhibition  —  the  need  to  maintain  a  "consistent" 
processor  state  prevents  some  instructions  which  have  been 
executed  out-of-order  from  depositing  their  results;  this  in  turn 
inhibits  the  execution  of  subsequent  instructions  which  use  these 
results  as  operands; 

iii.  Update  —  Certain  schemes,  such  as  checkpointing,  require 
run-time  continuous  updating  operations,  which  have  to  be 
performed  by  the  processor. 

The CDC and CRAY machines [1, 2] all have multiple functional units and
allow instructions to be executed out of order. They generally allow instructions
under  execution  to  complete  before  the  processor  state  is  stored;  a  penalty  in 
long  latency  is  consequently  exacted.  In  the  IBM  360/91  [3],  a  precise  interrupt 
is  supported  by  allowing  all  issued  instructions  to  complete  their  execution; 
this  results  in  considerable  latency.  If  an  imprecise  interrupt  is  generated,  the 
processor  state  of  the  system  is  lost  and  the  system  cannot  be  restarted 
precisely  at  the  interrupted  point. 

More recently, machines that allow multiple, out-of-order instruction
issuances, executions, and completions have been proposed. The
HPSm  [4]  implements  the  "High  Performance  Substrate"  model  of  execution. 
In  order  to  respond  to  interrupt  requests,  checkpointing  has  been  proposed  to 
allow  precise  handling  [5].  In  such  a  scheme,  a  minimum  of  two  checkpoints 
and  hence  two  additional  processor  states  have  to  be  maintained. 

Clearly,  the  approach  proposed  for  HPS  will  degrade  system  performance, 
both  in  processor  speed,  and  in  the  time  required  to  restore  to  a  consistent 
processor  state  upon  receiving  an  interrupt  request.  The  speed  of  the  system 
will  be  slowed  down  by  the  movement  of  state  information  as  the  states 
change,  and  by  the  additional  read  instruction  which  must  precede  all 
instructions which alter the memory. A performance penalty has to be taken
to correct the memory to a consistent state when an interrupt request is
received. 

Several  interesting  methods  were  presented  in  [6]  to  realize  the  classical 
precise interrupts. Again, a certain amount of performance degradation results.

Branching is an indispensable ingredient in any meaningful program;
however, it injects performance-damping turbulence into the instruction
stream.  How  to  handle  conditional  branching  efficiently  remains  a  difficult 
challenge  for  computer  architects.  A  clear  survey  of  possible  techniques  in 
handling  conditional  branches  can  be  found  in  [7].  The  proposed  and 
implemented systems discussed previously do not address this problem
aggressively. Pre-fetching, small and tentative, is implemented in some.
Checkpointing  can  again  be  applied  to  allow  instruction  execution  on  an 
assumed  path.  If  the  assumption  made  is  proven  incorrect,  a  consistent 
processor  state  can  be  restored  through  the  processor  state  corresponding  to 
the  checkpoints  implemented  [5].  In  most  cases,  the  supply  of  instructions  is 
usually  disrupted  by  the  presence  of  conditional  branch  instructions. 
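
Checkpoint-based recovery from a mispredicted branch can be sketched as follows. This is a simplified model that saves only register contents; memory state, which a real scheme such as [5] must also handle, is ignored. The class `SpeculativeCore`, the register names, and the branch addresses are hypothetical.

```python
class SpeculativeCore:
    def __init__(self, regs):
        self.regs = dict(regs)
        self.checkpoints = []   # (branch_pc, saved register copy), oldest first

    def predict_branch(self, branch_pc):
        """Save a checkpoint before executing along the predicted path."""
        self.checkpoints.append((branch_pc, dict(self.regs)))

    def execute(self, dest, value):
        """Speculatively write a result into the working register state."""
        self.regs[dest] = value

    def resolve_branch(self, branch_pc, prediction_correct):
        idx = next(i for i, (pc, _) in enumerate(self.checkpoints) if pc == branch_pc)
        if prediction_correct:
            del self.checkpoints[idx]             # checkpoint no longer needed
        else:
            self.regs = self.checkpoints[idx][1]  # roll back to the saved state
            del self.checkpoints[idx:]            # younger checkpoints are invalid too

core = SpeculativeCore({"r1": 1, "r2": 0})
core.predict_branch(100)
core.execute("r1", 2)
core.predict_branch(200)
core.execute("r2", 9)
core.resolve_branch(100, prediction_correct=False)
after_rollback = dict(core.regs)

core.predict_branch(300)
core.execute("r2", 7)
core.resolve_branch(300, prediction_correct=True)
```

Each outstanding prediction costs one saved copy of the state, so the number of checkpoints bounds how far execution can run ahead on assumed paths.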

Our investigation indicates that, due to inter-instruction dependencies and
branching turbulence, a single instruction stream may not be able to take
full advantage of the execution resources of a superscalar processor. The
notion that such a processor can concurrently execute several independent
instruction streams is new and exciting.

PROGRESS 

We  have  made  considerable  progress  in  the  following  three  investigations: 

INTERRUPT  HANDLING:  We  have  improved  upon  the  "instruction  window" 
approach,  reported  previously,  to  implementing  efficient  and  prompt 
interrupt  handling. 

The  factors  that  must  be  considered  in  evaluating  the  effectiveness  of 
interrupt  handling  schemes  have  been  modified  to  be:  latency,  cost,  and 
performance  degradation.  We  have  introduced  a  new  parameter:  No  Return 
Point (NRP), which provides the machine designer with a means of achieving
flexible  responses  to  various  types  of  interrupts  and  exceptions.  Further,  the 
implementation  of  the  requisite  Instruction  Window  (IW)  has  been  studied 
in  detail. 

A paper presenting the results has been accepted for publication by the IEEE
Trans. on Computers, and a patent application has been pending since
January 1990.

We  are  completing  the  study  of  a  Fast  Dispatch  Stack  (FDS)  system,  which  will 
provide  another  approach  to  fast,  precise  interrupt  handling. 


BRANCH PREDICTION: The Fast Dispatch Stack (FDS) system under active
investigation will also facilitate speculative execution: instructions
preceding and following one or more predicted conditional branch
instructions may issue and execute to achieve high performance. When
necessary,  their  effects  are  undone  in  one  machine  cycle.  In  other  words,  a 
processor  can  execute  speculatively  on  predicted  paths  to  gain  superior 
performance  and  the  penalty  for  incorrect  guesses  is  not  significant. 

MULTIPLE  STREAM  PROCESSING:  We  have  continued  our  work  on  boosting 
superscalar  performance:  the  processing  of  two  or  more  independent 
instruction  streams  on  a  superscalar  processor,  creating  an  MIMD  system.  A 
paper  has  been  presented  at  the  1991  International  Conference  on  Parallel 
Processing  in  August,  1991. 

SCIENTIFIC  IMPACT  OF  RESEARCH 

Our  task  has  addressed  several  issues  that  computer  designers  will  face  in  the 
next  few  years. 

We  have  made  considerable  progress  in  the  instruction  issuing  mechanism, 
interrupt  handling,  branch  prediction  and  multi-stream  processing.  These 
features enhance significantly the performance of superscalar processors without
raising  the  clock  rate.  And  we  believe  that  our  study  provides  timely  and 
much  needed  investigation  into  areas  that  are  vital  to  the  further 
development  of  such  systems. 

We have active ongoing discussions with IBM, Intel and AMD. Dr. Harry
Dwyer, who completed his degree in August 1991, is working with
IBM/Austin on their superscalar processor development.